2025-12-04T11:10:55.7718942Z Current runner version: '2.329.0'
2025-12-04T11:10:55.7721935Z Runner name: 'linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv'
2025-12-04T11:10:55.7722343Z Runner group name: 'default'
2025-12-04T11:10:55.7722759Z Machine name: 'linux'
2025-12-04T11:10:55.7723885Z ##[group]GITHUB_TOKEN Permissions
2025-12-04T11:10:55.7724973Z Contents: read
2025-12-04T11:10:55.7725216Z Metadata: read
2025-12-04T11:10:55.7725482Z ##[endgroup]
2025-12-04T11:10:55.7726529Z Secret source: Actions
2025-12-04T11:10:55.7726821Z Prepare workflow directory
2025-12-04T11:10:55.7969291Z Prepare all required actions
2025-12-04T11:10:55.7989337Z Getting action download info
2025-12-04T11:10:56.2251098Z Download action repository 'pytorch/pytorch@main' (SHA:c0cb6e78404416d418350632bfc554710a5f7281)
2025-12-04T11:10:59.9020029Z Download action repository 'pytorch/test-infra@main' (SHA:39aa74d619174326f4e2fb0e216151c2f29d9ffd)
2025-12-04T11:11:01.0509058Z Download action repository 'actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02)
2025-12-04T11:11:01.8977373Z Download action repository 'aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722' (SHA:ececac1a45f3b08a01d2dd070d28d111c5fe6722)
2025-12-04T11:11:02.7275798Z Getting action download info
2025-12-04T11:11:02.9237678Z Download action repository 'actions/checkout@v4' (SHA:34e114876b0b11c390a56381ad16ebd13914f8d5)
2025-12-04T11:11:03.7297003Z Getting action download info
2025-12-04T11:11:03.9131206Z Download action repository 'nick-fields/retry@v3.0.0' (SHA:7152eba30c6575329ac0576536151aca5a72780e)
2025-12-04T11:11:04.6883677Z Getting action download info
2025-12-04T11:11:04.8880007Z Uses: pytorch/pytorch/.github/workflows/_rocm-test.yml@refs/heads/main (ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32)
2025-12-04T11:11:04.8882101Z ##[group] Inputs
2025-12-04T11:11:04.8882275Z build-environment: linux-noble-rocm-py3.12-mi300
2025-12-04T11:11:04.8883618Z test-matrix: {"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}]}
2025-12-04T11:11:04.8885156Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a
2025-12-04T11:11:04.8885469Z sync-tag:
2025-12-04T11:11:04.8885987Z timeout-minutes: 300
2025-12-04T11:11:04.8886091Z tests-to-include:
2025-12-04T11:11:04.8886195Z dashboard-tag:
2025-12-04T11:11:04.8886425Z disable-monitor: true
2025-12-04T11:11:04.8886553Z monitor-log-interval: 5
2025-12-04T11:11:04.8886683Z monitor-data-collect-interval: 1
2025-12-04T11:11:04.8886807Z ##[endgroup]
2025-12-04T11:11:04.8887058Z Complete job name: linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check)
2025-12-04T11:11:04.9373905Z ##[group]Run pytorch/pytorch/.github/actions/checkout-pytorch@main
2025-12-04T11:11:04.9374395Z with:
2025-12-04T11:11:04.9374547Z no-sudo: true
2025-12-04T11:11:04.9374975Z submodules: recursive
2025-12-04T11:11:04.9375124Z fetch-depth: 0
2025-12-04T11:11:04.9375347Z env:
2025-12-04T11:11:04.9375511Z GIT_DEFAULT_BRANCH: main
2025-12-04T11:11:04.9375716Z ##[endgroup]
2025-12-04T11:11:04.9463346Z ##[group]Run echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-12-04T11:11:04.9463937Z echo "IN_CONTAINER_RUNNER=$(if [ -f /.inarc ] || [ -f /.incontainer ]; then echo true ; else echo false; fi)" >> "$GITHUB_OUTPUT"
2025-12-04T11:11:04.9472630Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0}
2025-12-04T11:11:04.9472853Z env:
2025-12-04T11:11:04.9472984Z GIT_DEFAULT_BRANCH: main
2025-12-04T11:11:04.9473132Z ##[endgroup]
2025-12-04T11:11:04.9629888Z ##[group]Run actions/checkout@v4
2025-12-04T11:11:04.9630062Z with:
2025-12-04T11:11:04.9630178Z ref: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32
2025-12-04T11:11:04.9630313Z fetch-depth: 0
2025-12-04T11:11:04.9630410Z submodules: recursive
2025-12-04T11:11:04.9630590Z show-progress: false
2025-12-04T11:11:04.9630696Z repository: pytorch/pytorch
2025-12-04T11:11:04.9630869Z token: ***
2025-12-04T11:11:04.9630956Z ssh-strict: true
2025-12-04T11:11:04.9631042Z ssh-user: git
2025-12-04T11:11:04.9631136Z persist-credentials: true
2025-12-04T11:11:04.9631240Z clean: true
2025-12-04T11:11:04.9631336Z sparse-checkout-cone-mode: true
2025-12-04T11:11:04.9631451Z fetch-tags: false
2025-12-04T11:11:04.9631541Z lfs: false
2025-12-04T11:11:04.9631628Z set-safe-directory: true
2025-12-04T11:11:04.9631727Z env:
2025-12-04T11:11:04.9631811Z GIT_DEFAULT_BRANCH: main
2025-12-04T11:11:04.9631911Z ##[endgroup]
2025-12-04T11:11:05.0168356Z Syncing repository: pytorch/pytorch
2025-12-04T11:11:05.0168971Z ##[group]Getting Git version info
2025-12-04T11:11:05.0169142Z Working directory is '/home/runner/_work/pytorch/pytorch'
2025-12-04T11:11:05.0169392Z [command]/usr/bin/git version
2025-12-04T11:11:05.0169500Z git version 2.52.0
2025-12-04T11:11:05.0169880Z ##[endgroup]
2025-12-04T11:11:05.0172984Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/14ae3c61-701b-4715-81e3-50a9739370e1/.gitconfig'
2025-12-04T11:11:05.0177629Z Temporarily overriding HOME='/home/runner/_work/_temp/14ae3c61-701b-4715-81e3-50a9739370e1' before making global git config changes
2025-12-04T11:11:05.0177965Z Adding repository directory to the temporary git global config as a safe directory
2025-12-04T11:11:05.0179968Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch
2025-12-04T11:11:05.0204486Z [command]/usr/bin/git config --local --get remote.origin.url
2025-12-04T11:11:05.0230329Z https://github.com/pytorch/pytorch
2025-12-04T11:11:05.0245536Z ##[group]Removing previously created refs, to avoid conflicts
2025-12-04T11:11:05.0249314Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD
2025-12-04T11:11:05.0275819Z refs/heads/main
2025-12-04T11:11:05.0286813Z [command]/usr/bin/git checkout --detach
2025-12-04T11:11:06.7644138Z HEAD is now at c0cb6e784044 [DTensor] ExplicitRedistributionContext warning mode (#169452)
2025-12-04T11:11:06.7692079Z [command]/usr/bin/git branch --delete --force main
2025-12-04T11:11:06.7854912Z Deleted branch main (was c0cb6e784044).
2025-12-04T11:11:06.7862148Z ##[endgroup]
2025-12-04T11:11:06.7866795Z [command]/usr/bin/git submodule status
2025-12-04T11:11:06.8116839Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe)
2025-12-04T11:11:06.8158659Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 third_party/FP16 (4dfe081)
2025-12-04T11:11:06.8202397Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327)
2025-12-04T11:11:06.8259735Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0)
2025-12-04T11:11:06.8325455Z 3ebbc93ded7285963bff932c678fa367eb393ba6 third_party/NVTX (v3.1.0-313-g3ebbc93)
2025-12-04T11:11:06.8396564Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600)
2025-12-04T11:11:06.8777914Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656)
2025-12-04T11:11:06.8811595Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101)
2025-12-04T11:11:06.8833103Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3)
2025-12-04T11:11:06.8895947Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d)
2025-12-04T11:11:06.8995956Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0)
2025-12-04T11:11:06.9093571Z f858c30bcb16f8effd5ff46996f0514539e17abc third_party/cpuinfo (f858c30)
2025-12-04T11:11:06.9130054Z 0b1577c8c83401237d601d0d0db5210506705396 third_party/cudnn_frontend (v0.5-61-g0b1577c)
2025-12-04T11:11:06.9214994Z f88806b1e31dfa579842638740216dd41fc6c588 third_party/cutlass (v4.3.1)
2025-12-04T11:11:06.9247895Z c0b988d39a9e47c794d699f29930ed4d7c7e13a4 third_party/fbgemm (v1.4.0-rc1-2-gc0b988d39)
2025-12-04T11:11:06.9313650Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4)
2025-12-04T11:11:06.9338962Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23)
2025-12-04T11:11:06.9678879Z 407c905e45ad75fc29bf0f9bb7c5c2fd3475976f third_party/fmt (12.1.0)
2025-12-04T11:11:06.9736913Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17)
2025-12-04T11:11:06.9821749Z 54cbae0d3a67fa890b4c3d9ee162b7860315e341 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-37-g54cbae0)
2025-12-04T11:11:06.9958464Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108)
2025-12-04T11:11:07.0009505Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1)
2025-12-04T11:11:07.0045676Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5)
2025-12-04T11:11:07.0161557Z 31f85df8fbd89c188f14ef10f1ec65379786b943 third_party/kineto (heads/main)
2025-12-04T11:11:07.0175826Z d7770c89632329a9914ef1a90289917597639cbe third_party/kleidiai (v1.15.0)
2025-12-04T11:11:07.0190111Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4)
2025-12-04T11:11:07.0212973Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0)
2025-12-04T11:11:07.0422476Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0)
2025-12-04T11:11:07.0442195Z
a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2) 2025-12-04T11:11:07.0494103Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5) 2025-12-04T11:11:07.0705687Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b) 2025-12-04T11:11:07.0766428Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-12-04T11:11:07.0807750Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-12-04T11:11:07.0821217Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 (v3.0.1) 2025-12-04T11:11:07.0873556Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated) 2025-12-04T11:11:07.0946254Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-12-04T11:11:07.1021094Z 2b4cd91092d335a697416b2a3cb398283246849d third_party/tensorpipe (heads/main) 2025-12-04T11:11:07.1032268Z ##[group]Cleaning the repository 2025-12-04T11:11:07.1035362Z [command]/usr/bin/git clean -ffdx 2025-12-04T11:11:07.1185191Z [command]/usr/bin/git reset --hard HEAD 2025-12-04T11:11:07.1984880Z HEAD is now at c0cb6e784044 [DTensor] ExplicitRedistributionContext warning mode (#169452) 2025-12-04T11:11:07.2046028Z ##[endgroup] 2025-12-04T11:11:07.2047120Z ##[group]Disabling automatic garbage collection 2025-12-04T11:11:07.2050293Z [command]/usr/bin/git config --local gc.auto 0 2025-12-04T11:11:07.2081912Z ##[endgroup] 2025-12-04T11:11:07.2082222Z ##[group]Setting up auth 2025-12-04T11:11:07.2085305Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T11:11:07.2102809Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T11:11:07.2296132Z Entering 'android/libs/fbjni' 2025-12-04T11:11:07.2323993Z Entering 'third_party/FP16' 2025-12-04T11:11:07.2350250Z Entering 'third_party/FXdiv' 2025-12-04T11:11:07.2376106Z Entering 'third_party/NNPACK' 2025-12-04T11:11:07.2401768Z Entering 'third_party/NVTX' 2025-12-04T11:11:07.2429556Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:07.2454672Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:07.2485632Z Entering 'third_party/aiter' 2025-12-04T11:11:07.2512680Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:07.2540697Z Entering 'third_party/benchmark' 2025-12-04T11:11:07.2566583Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:07.2594628Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:07.2620063Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:07.2644552Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:07.2668923Z Entering 'third_party/cutlass' 2025-12-04T11:11:07.2697018Z Entering 'third_party/fbgemm' 2025-12-04T11:11:07.2724124Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:07.2747501Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:07.2776404Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:07.2800760Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:07.2828759Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:07.2852463Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:07.2883948Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:07.2911954Z Entering 'third_party/flash-attention' 2025-12-04T11:11:07.2939190Z Entering 
'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:07.2965805Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:07.2992988Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:07.3018934Z Entering 'third_party/fmt' 2025-12-04T11:11:07.3043574Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:07.3066934Z Entering 'third_party/gloo' 2025-12-04T11:11:07.3091476Z Entering 'third_party/googletest' 2025-12-04T11:11:07.3116957Z Entering 'third_party/ideep' 2025-12-04T11:11:07.3145555Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:07.3171817Z Entering 'third_party/ittapi' 2025-12-04T11:11:07.3196259Z Entering 'third_party/kineto' 2025-12-04T11:11:07.3222080Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:07.3245199Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:07.3268037Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:07.3292008Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:07.3315906Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:07.3341501Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:07.3366193Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:07.3390126Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:07.3413041Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:07.3436626Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:07.3460173Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:07.3487389Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.3511814Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.3540157Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:07.3563632Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:07.3588778Z Entering 'third_party/kleidiai' 2025-12-04T11:11:07.3613844Z Entering 'third_party/mimalloc' 2025-12-04T11:11:07.3637683Z Entering 'third_party/nlohmann' 2025-12-04T11:11:07.3662340Z Entering 'third_party/onnx' 2025-12-04T11:11:07.3693421Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:07.3719500Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:07.3744378Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:07.3767796Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:07.3792096Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:07.3815241Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:07.3839072Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:07.3861830Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:07.3886155Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:07.3911271Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.3936347Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.3961795Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:07.3991940Z Entering 'third_party/pocketfft' 2025-12-04T11:11:07.4022942Z Entering 'third_party/protobuf' 2025-12-04T11:11:07.4055825Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:07.4078669Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:07.4109085Z Entering 'third_party/psimd' 2025-12-04T11:11:07.4134268Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:07.4159061Z Entering 'third_party/pybind11' 2025-12-04T11:11:07.4183167Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:07.4206663Z Entering 'third_party/sleef' 2025-12-04T11:11:07.4230190Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:07.4254797Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:07.4277589Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:07.4300603Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:07.4323907Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:07.4348384Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:07.4388111Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T11:11:07.4407707Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T11:11:07.4556521Z Entering 'android/libs/fbjni' 2025-12-04T11:11:07.4580883Z Entering 'third_party/FP16' 2025-12-04T11:11:07.4604628Z Entering 'third_party/FXdiv' 2025-12-04T11:11:07.4627423Z Entering 'third_party/NNPACK' 2025-12-04T11:11:07.4649936Z Entering 'third_party/NVTX' 2025-12-04T11:11:07.4673294Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:07.4696642Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:07.4726144Z Entering 'third_party/aiter' 2025-12-04T11:11:07.4758558Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:07.4795395Z Entering 'third_party/benchmark' 2025-12-04T11:11:07.4818782Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:07.4846047Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:07.4869012Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:07.4897007Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:07.4920030Z Entering 'third_party/cutlass' 2025-12-04T11:11:07.4948233Z Entering 'third_party/fbgemm' 2025-12-04T11:11:07.4973875Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:07.4996203Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:07.5021782Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:07.5046232Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:07.5073427Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:07.5097246Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:07.5120274Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:07.5145943Z Entering 'third_party/flash-attention' 2025-12-04T11:11:07.5169359Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:07.5193975Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:07.5223181Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:07.5249171Z Entering 
'third_party/fmt' 2025-12-04T11:11:07.5272235Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:07.5295539Z Entering 'third_party/gloo' 2025-12-04T11:11:07.5318899Z Entering 'third_party/googletest' 2025-12-04T11:11:07.5341406Z Entering 'third_party/ideep' 2025-12-04T11:11:07.5374300Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:07.5408103Z Entering 'third_party/ittapi' 2025-12-04T11:11:07.5433628Z Entering 'third_party/kineto' 2025-12-04T11:11:07.5457097Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:07.5480541Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:07.5504556Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:07.5528020Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:07.5554153Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:07.5578767Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:07.5609005Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:07.5632758Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:07.5658069Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:07.5685493Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:07.5709703Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:07.5733338Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.5757500Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.5787140Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:07.5810990Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:07.5836063Z Entering 'third_party/kleidiai' 2025-12-04T11:11:07.5861052Z Entering 'third_party/mimalloc' 2025-12-04T11:11:07.5884571Z Entering 'third_party/nlohmann' 2025-12-04T11:11:07.5910953Z Entering 'third_party/onnx' 2025-12-04T11:11:07.5942110Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:07.5968711Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:07.5992667Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:07.6016375Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:07.6039252Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:07.6067166Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:07.6095446Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:07.6119279Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:07.6143744Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:07.6167666Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.6190406Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.6216357Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:07.6252924Z Entering 'third_party/pocketfft' 2025-12-04T11:11:07.6277495Z Entering 
'third_party/protobuf' 2025-12-04T11:11:07.6302239Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:07.6325255Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:07.6351127Z Entering 'third_party/psimd' 2025-12-04T11:11:07.6376916Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:07.6399712Z Entering 'third_party/pybind11' 2025-12-04T11:11:07.6424977Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:07.6450358Z Entering 'third_party/sleef' 2025-12-04T11:11:07.6476489Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:07.6507890Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:07.6532548Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:07.6558246Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:07.6581660Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:07.6605269Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:07.6645548Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.6667936Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T11:11:07.6840892Z Entering 'android/libs/fbjni' 2025-12-04T11:11:07.6852443Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T11:11:07.6866669Z Entering 'third_party/FP16' 2025-12-04T11:11:07.6881720Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T11:11:07.6891439Z Entering 'third_party/FXdiv' 2025-12-04T11:11:07.6905243Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T11:11:07.6916663Z Entering 'third_party/NNPACK' 2025-12-04T11:11:07.6929427Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T11:11:07.6937456Z Entering 'third_party/NVTX' 2025-12-04T11:11:07.6958944Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T11:11:07.6967620Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:07.6979370Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T11:11:07.6988933Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:07.7001537Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T11:11:07.7017196Z Entering 'third_party/aiter' 2025-12-04T11:11:07.7028778Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T11:11:07.7039036Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:07.7048902Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T11:11:07.7066819Z Entering 'third_party/benchmark' 2025-12-04T11:11:07.7078411Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:07.7087101Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:07.7098050Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T11:11:07.7114838Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:07.7125440Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T11:11:07.7135549Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:07.7149238Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T11:11:07.7159184Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:07.7172323Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T11:11:07.7182684Z Entering 'third_party/cutlass' 2025-12-04T11:11:07.7195034Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T11:11:07.7208300Z Entering 'third_party/fbgemm' 2025-12-04T11:11:07.7220417Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T11:11:07.7231180Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:07.7259142Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T11:11:07.7282489Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:07.7309404Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T11:11:07.7337672Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:07.7354918Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T11:11:07.7383252Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:07.7402111Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T11:11:07.7426539Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:07.7445488Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T11:11:07.7455393Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:07.7472290Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T11:11:07.7486025Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:07.7500449Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T11:11:07.7519875Z Entering 'third_party/flash-attention' 2025-12-04T11:11:07.7535546Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T11:11:07.7549276Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:07.7563747Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T11:11:07.7581462Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:07.7594814Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T11:11:07.7616470Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:07.7631003Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T11:11:07.7647337Z Entering 'third_party/fmt' 2025-12-04T11:11:07.7661268Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 
2025-12-04T11:11:07.7675962Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:07.7694189Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T11:11:07.7707677Z Entering 'third_party/gloo' 2025-12-04T11:11:07.7725662Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T11:11:07.7739172Z Entering 'third_party/googletest' 2025-12-04T11:11:07.7756128Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.7768449Z Entering 'third_party/ideep' 2025-12-04T11:11:07.7784905Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T11:11:07.7796485Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:07.7809820Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T11:11:07.7827694Z Entering 'third_party/ittapi' 2025-12-04T11:11:07.7842419Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T11:11:07.7853907Z Entering 'third_party/kineto' 2025-12-04T11:11:07.7867077Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T11:11:07.7880685Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:07.7897557Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T11:11:07.7910039Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:07.7926949Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T11:11:07.7940043Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:07.7953148Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T11:11:07.7965453Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:07.7980764Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:07.7992437Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:07.8005763Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T11:11:07.8018129Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:07.8032733Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T11:11:07.8047487Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:07.8060884Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T11:11:07.8073946Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:07.8087241Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.8098355Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:07.8111623Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T11:11:07.8123752Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:07.8136731Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T11:11:07.8147981Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:07.8160931Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:07.8173746Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.8192273Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:07.8204405Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.8218101Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:07.8235611Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:07.8250697Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T11:11:07.8263314Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:07.8276845Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.8295387Z Entering 'third_party/kleidiai' 2025-12-04T11:11:07.8311254Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T11:11:07.8326097Z Entering 'third_party/mimalloc' 2025-12-04T11:11:07.8341937Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T11:11:07.8355783Z Entering 'third_party/nlohmann' 2025-12-04T11:11:07.8370031Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T11:11:07.8384029Z Entering 'third_party/onnx' 2025-12-04T11:11:07.8401463Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T11:11:07.8423275Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:07.8438896Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:07.8456843Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:07.8472161Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T11:11:07.8488434Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 
2025-12-04T11:11:07.8500997Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:07.8513343Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:07.8527066Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.8539202Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:07.8552920Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T11:11:07.8565572Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:07.8577917Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T11:11:07.8591545Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:07.8605929Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T11:11:07.8616354Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:07.8631005Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T11:11:07.8642963Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:07.8657050Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:07.8669630Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:07.8684357Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:07.8697071Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:07.8711919Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:07.8733834Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:07.8747538Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T11:11:07.8773443Z Entering 'third_party/pocketfft' 2025-12-04T11:11:07.8790364Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T11:11:07.8805312Z Entering 'third_party/protobuf' 2025-12-04T11:11:07.8826515Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T11:11:07.8841619Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:07.8858745Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:07.8872071Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:07.8885654Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.8904840Z Entering 'third_party/psimd' 
2025-12-04T11:11:07.8921665Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T11:11:07.8938136Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:07.8953395Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T11:11:07.8968696Z Entering 'third_party/pybind11' 2025-12-04T11:11:07.8986306Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:07.9000792Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:07.9017772Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T11:11:07.9033794Z Entering 'third_party/sleef' 2025-12-04T11:11:07.9050772Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T11:11:07.9065287Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:07.9082920Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T11:11:07.9095728Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:07.9111158Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:07.9123807Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:07.9136657Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T11:11:07.9151344Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:07.9163977Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T11:11:07.9174025Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:07.9183550Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:07.9191304Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:07.9201221Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T11:11:07.9228143Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9251858Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9272067Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9287848Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9304266Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9323553Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9338660Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9352314Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9368028Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9382011Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9396349Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9410174Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9427755Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9441846Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9457138Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9472666Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9487816Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9502740Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9516814Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9530473Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9544145Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9557966Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9572032Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9586178Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9599475Z 
[command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9612392Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9628227Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9642051Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9656634Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9670664Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9684779Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9698932Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9713312Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9726851Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9743085Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9757732Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9774055Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9795351Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9809383Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9822339Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9836555Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp 
^includeIf\.gitdir: 2025-12-04T11:11:07.9850432Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9873176Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9888963Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9917131Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9936920Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9963365Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:07.9990289Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0018757Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0034222Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0052486Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0068205Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0084684Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0101020Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0121607Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0137697Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0153811Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0170535Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0186183Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0201664Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0216235Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0229748Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0245212Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0259445Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0274755Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0287940Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0301893Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0316419Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0330604Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0345534Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0359613Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0372471Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0396944Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0420539Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0435937Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0450125Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0464403Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0481707Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0490076Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0510559Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0522739Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:08.0546181Z [command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T11:11:08.0579696Z ##[endgroup] 2025-12-04T11:11:08.0579986Z ##[group]Fetching the repository 2025-12-04T11:11:08.0583543Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-12-04T11:11:09.5880624Z From https://github.com/pytorch/pytorch 2025-12-04T11:11:09.5881189Z * [new branch] 2.6.0.dev20241004+ -> origin/2.6.0.dev20241004+ 2025-12-04T11:11:09.5881722Z * [new branch] 2.9.1 -> origin/2.9.1 2025-12-04T11:11:09.5882326Z * [new branch] AaronWang04_addmmfusion_perftest -> origin/AaronWang04_addmmfusion_perftest 2025-12-04T11:11:09.5882954Z * [new branch] Flamefire-patch-1 -> origin/Flamefire-patch-1 2025-12-04T11:11:09.5883549Z * [new branch] HDCharles-2.6.0-release-notes -> origin/HDCharles-2.6.0-release-notes 2025-12-04T11:11:09.5884121Z * [new branch] HOPrintFunc -> origin/HOPrintFunc 2025-12-04T11:11:09.5884628Z * [new branch] IvanKobzarev/stack/1 -> origin/IvanKobzarev/stack/1 2025-12-04T11:11:09.5885144Z * [new branch] NicoshevSVE128 -> origin/NicoshevSVE128 2025-12-04T11:11:09.5885667Z * [new branch] PR-AOTInductorNoneBug -> origin/PR-AOTInductorNoneBug 2025-12-04T11:11:09.5886240Z * [new branch] PR-AOTInductorNoneBugFix -> origin/PR-AOTInductorNoneBugFix 2025-12-04T11:11:09.5886816Z * [new branch] PR-FixConfigsIssue -> origin/PR-FixConfigsIssue 2025-12-04T11:11:09.5887348Z * [new branch] PR-NoneBugFix-viable -> origin/PR-NoneBugFix-viable 2025-12-04T11:11:09.5887852Z * [new branch] PR-ResetToZero -> origin/PR-ResetToZero 2025-12-04T11:11:09.5888480Z * [new branch] Update-Flash-Packaging -> 
origin/Update-Flash-Packaging 2025-12-04T11:11:09.5889004Z * [new branch] VLA_exp -> origin/VLA_exp 2025-12-04T11:11:09.5889473Z * [new branch] activation_bench -> origin/activation_bench 2025-12-04T11:11:09.5889975Z * [new branch] addmm-heuristic -> origin/addmm-heuristic 2025-12-04T11:11:09.5890476Z * [new branch] adi/onednn_aarch64 -> origin/adi/onednn_aarch64 2025-12-04T11:11:09.5890956Z * [new branch] adi/test -> origin/adi/test 2025-12-04T11:11:09.5891292Z * [new branch] adi/test_bgemm -> origin/adi/test_bgemm 2025-12-04T11:11:09.5891466Z * [new branch] adi/test_m8g -> origin/adi/test_m8g 2025-12-04T11:11:09.5891635Z * [new branch] adi/test_onednn -> origin/adi/test_onednn 2025-12-04T11:11:09.5891815Z * [new branch] adi/test_onednn_v3.9 -> origin/adi/test_onednn_v3.9 2025-12-04T11:11:09.5892006Z * [new branch] adi/test_presve_change -> origin/adi/test_presve_change 2025-12-04T11:11:09.5892187Z * [new branch] adi/test_timm -> origin/adi/test_timm 2025-12-04T11:11:09.5892911Z * [new branch] adi/testpresve_change -> origin/adi/testpresve_change 2025-12-04T11:11:09.5893110Z * [new branch] aditew01/test/vec_bf16 -> origin/aditew01/test/vec_bf16 2025-12-04T11:11:09.5893307Z * [new branch] ah-globalfeedback-hook -> origin/ah-globalfeedback-hook 2025-12-04T11:11:09.5893625Z * [new branch] albanD-patch-1 -> origin/albanD-patch-1 2025-12-04T11:11:09.5893815Z * [new branch] also-surround-shimh -> origin/also-surround-shimh 2025-12-04T11:11:09.5894006Z * [new branch] angelayi/aot_compile -> origin/angelayi/aot_compile 2025-12-04T11:11:09.5894236Z * [new branch] angelayi/aoti_additional_files -> origin/angelayi/aoti_additional_files 2025-12-04T11:11:09.5894453Z * [new branch] angelayi/benchmark -> origin/angelayi/benchmark 2025-12-04T11:11:09.5894678Z * [new branch] angelayi/change_pytree_serialization -> origin/angelayi/change_pytree_serialization 2025-12-04T11:11:09.5894916Z * [new branch] angelayi/cpp_loader -> origin/angelayi/cpp_loader 2025-12-04T11:11:09.5895111Z * [new branch] angelayi/inductor_const -> origin/angelayi/inductor_const 2025-12-04T11:11:09.5895299Z * [new branch] angelayi/lstm -> origin/angelayi/lstm 2025-12-04T11:11:09.5895481Z * [new branch] angelayi/no_so_weight -> origin/angelayi/no_so_weight 2025-12-04T11:11:09.5895671Z * [new branch] angelayi/scan_layers -> origin/angelayi/scan_layers 2025-12-04T11:11:09.5895853Z * [new branch] angelayi/side_eff -> origin/angelayi/side_eff 2025-12-04T11:11:09.5896036Z * [new branch] angelayi/state_dict -> origin/angelayi/state_dict 2025-12-04T11:11:09.5896231Z * [new branch] angelayi/symint_input -> origin/angelayi/symint_input 2025-12-04T11:11:09.5896416Z * [new branch] angelayi/symm_mem -> origin/angelayi/symm_mem 2025-12-04T11:11:09.5896597Z * [new branch] angelayi/test_cpp -> origin/angelayi/test_cpp 2025-12-04T11:11:09.5896780Z * [new branch] angelayi/torch_size -> origin/angelayi/torch_size 2025-12-04T11:11:09.5896965Z * [new branch] annotate_assert -> origin/annotate_assert 2025-12-04T11:11:09.5897162Z * [new branch] annotate_fallback_kernel -> origin/annotate_fallback_kernel 2025-12-04T11:11:09.5897364Z * [new branch] annotation_deepcopy -> origin/annotation_deepcopy 2025-12-04T11:11:09.5897563Z * [new branch] annotation_dynamo -> origin/annotation_dynamo 2025-12-04T11:11:09.5897755Z * [new branch] aot_eager_stack_trace -> origin/aot_eager_stack_trace 2025-12-04T11:11:09.5897941Z * [new branch] aoti-cuda-alloc -> origin/aoti-cuda-alloc 2025-12-04T11:11:09.5898124Z * [new branch] aoti_const_device -> origin/aoti_const_device 
2025-12-04T11:11:09.5898369Z * [new branch] aoti_fqn_name_interface -> origin/aoti_fqn_name_interface 2025-12-04T11:11:09.5898586Z * [new branch] aoti_package_weights_binary -> origin/aoti_package_weights_binary 2025-12-04T11:11:09.5898799Z * [new branch] aoti_target_windows -> origin/aoti_target_windows 2025-12-04T11:11:09.5899029Z * [new branch] arsh/feat/inductor_check_profiling -> origin/arsh/feat/inductor_check_profiling 2025-12-04T11:11:09.5899249Z * [new branch] async_tp -> origin/async_tp 2025-12-04T11:11:09.5899459Z * [new branch] atalman-inductor-perf-cu124 -> origin/atalman-inductor-perf-cu124 2025-12-04T11:11:09.5899723Z * [new branch] atalman-inductor-perf-cu124.1 -> origin/atalman-inductor-perf-cu124.1 2025-12-04T11:11:09.5899946Z * [new branch] atalman-patch-2 -> origin/atalman-patch-2 2025-12-04T11:11:09.5900131Z * [new branch] atalman-patch-3 -> origin/atalman-patch-3 2025-12-04T11:11:09.5900362Z * [new branch] atalman-patch-4 -> origin/atalman-patch-4 2025-12-04T11:11:09.5900544Z * [new branch] atalman-patch-5 -> origin/atalman-patch-5 2025-12-04T11:11:09.5900837Z * [new branch] atalman-patch-6 -> origin/atalman-patch-6 2025-12-04T11:11:09.5901019Z * [new branch] atalman-patch-7 -> origin/atalman-patch-7 2025-12-04T11:11:09.5902085Z * [new branch] atalman-patch-8 -> origin/atalman-patch-8 2025-12-04T11:11:09.5902377Z * [new branch] atalman_inductor_2.3.1 -> origin/atalman_inductor_2.3.1 2025-12-04T11:11:09.5902612Z * [new branch] atalman_inductor_2.4.0 -> origin/atalman_inductor_2.4.0 2025-12-04T11:11:09.5902810Z * [new branch] atalman_inductor_2.4.x -> origin/atalman_inductor_2.4.x 2025-12-04T11:11:09.5903086Z * [new branch] attention_benchmarking_clean -> origin/attention_benchmarking_clean 2025-12-04T11:11:09.5903318Z * [new branch] bahuang/dt_fix_scalar_add -> origin/bahuang/dt_fix_scalar_add 2025-12-04T11:11:09.5903530Z * [new branch] bahuang/fix_debug_mode -> origin/bahuang/fix_debug_mode 2025-12-04T11:11:09.5903740Z * [new branch] bahuang/fix_expand -> origin/bahuang/fix_expand 2025-12-04T11:11:09.5903925Z * [new branch] bahuang/test -> origin/bahuang/test 2025-12-04T11:11:09.5904099Z * [new branch] base/1.5 -> origin/base/1.5 2025-12-04T11:11:09.5904311Z * [new branch] batching_sdpa_efficient_attention -> origin/batching_sdpa_efficient_attention 2025-12-04T11:11:09.5904544Z * [new branch] bench_scaled_mm_ops -> origin/bench_scaled_mm_ops 2025-12-04T11:11:09.5904748Z * [new branch] benchmark-updates -> origin/benchmark-updates 2025-12-04T11:11:09.5904954Z * [new branch] benchmarking-script -> origin/benchmarking-script 2025-12-04T11:11:09.5905154Z * [new branch] bertmaher/pinbump26 -> origin/bertmaher/pinbump26 2025-12-04T11:11:09.5905343Z * [new branch] bertrand/cutlass -> origin/bertrand/cutlass 2025-12-04T11:11:09.5905537Z * [new branch] bf/bug-static-input -> origin/bf/bug-static-input 2025-12-04T11:11:09.5905731Z * [new branch] bf/cg-backend -> origin/bf/cg-backend 2025-12-04T11:11:09.5905915Z * [new branch] bf/cg-nccl-test -> origin/bf/cg-nccl-test 2025-12-04T11:11:09.5906096Z * [new branch] bf/cg-remove-check -> origin/bf/cg-remove-check 2025-12-04T11:11:09.5906298Z * [new branch] bf/clean-torchbench-hf -> origin/bf/clean-torchbench-hf 2025-12-04T11:11:09.5906500Z * [new branch] bf/combo-debug-log -> origin/bf/combo-debug-log 2025-12-04T11:11:09.5906690Z * [new branch] bf/cudagraph -> origin/bf/cudagraph 2025-12-04T11:11:09.5906933Z * [new branch] bf/cudagraph-disable-input-mutation -> origin/bf/cudagraph-disable-input-mutation 2025-12-04T11:11:09.5907296Z * 
[new branch] bf/cudagraph-enable-input-mutation-support-benchmark -> origin/bf/cudagraph-enable-input-mutation-support-benchmark 2025-12-04T11:11:09.5907618Z * [new branch] bf/cudagraph-partition -> origin/bf/cudagraph-partition 2025-12-04T11:11:09.5907905Z * [new branch] bf/donated-buffer-bench -> origin/bf/donated-buffer-bench 2025-12-04T11:11:09.5908110Z * [new branch] bf/dynamo-partition -> origin/bf/dynamo-partition 2025-12-04T11:11:09.5908328Z * [new branch] bf/lite -> origin/bf/lite 2025-12-04T11:11:09.5908516Z * [new branch] bf/pa-non-divisible -> origin/bf/pa-non-divisible 2025-12-04T11:11:09.5908750Z * [new branch] bf/partition-cache-free-symbols -> origin/bf/partition-cache-free-symbols 2025-12-04T11:11:09.5909356Z * [new branch] bf/partition-memory-plan -> origin/bf/partition-memory-plan 2025-12-04T11:11:09.5909572Z * [new branch] bf/partition-move-cpu -> origin/bf/partition-move-cpu 2025-12-04T11:11:09.5909928Z * [new branch] bf/partition-view-fallback -> origin/bf/partition-view-fallback 2025-12-04T11:11:09.5910147Z * [new branch] bf/remove-check-55b0c39d -> origin/bf/remove-check-55b0c39d 2025-12-04T11:11:09.5910358Z * [new branch] bf/timm-nov-26-2025 -> origin/bf/timm-nov-26-2025 2025-12-04T11:11:09.5910572Z * [new branch] bf/transformer-pin-4-57-3 -> origin/bf/transformer-pin-4-57-3 2025-12-04T11:11:09.5910798Z * [new branch] bisect_perf_hf_T5_3acc6eac492 -> origin/bisect_perf_hf_T5_3acc6eac492 2025-12-04T11:11:09.5911026Z * [new branch] bisect_perf_hf_T5_3fcf66f61fb -> origin/bisect_perf_hf_T5_3fcf66f61fb 2025-12-04T11:11:09.5911251Z * [new branch] bisect_perf_hf_T5_4009d154129 -> origin/bisect_perf_hf_T5_4009d154129 2025-12-04T11:11:09.5911464Z * [new branch] bisect_perf_hf_T5_40d0740e73d -> origin/bisect_perf_hf_T5_40d0740e73d 2025-12-04T11:11:09.5911683Z * [new branch] bisect_perf_hf_T5_5268754e -> origin/bisect_perf_hf_T5_5268754e 2025-12-04T11:11:09.5911897Z * [new branch] bisect_perf_hf_T5_7d89a8d385c -> origin/bisect_perf_hf_T5_7d89a8d385c 2025-12-04T11:11:09.5912121Z * [new branch] bisect_perf_hf_T5_b7a25c1ee7c -> origin/bisect_perf_hf_T5_b7a25c1ee7c 2025-12-04T11:11:09.5912340Z * [new branch] bisect_perf_hf_T5_c25b201583f -> origin/bisect_perf_hf_T5_c25b201583f 2025-12-04T11:11:09.5912555Z * [new branch] bisect_perf_hf_T5_c93e57efac0 -> origin/bisect_perf_hf_T5_c93e57efac0 2025-12-04T11:11:09.5912767Z * [new branch] bisect_perf_hf_T5_ca9813ea149 -> origin/bisect_perf_hf_T5_ca9813ea149 2025-12-04T11:11:09.5912984Z * [new branch] bisect_perf_hf_T5_d65f194a -> origin/bisect_perf_hf_T5_d65f194a 2025-12-04T11:11:09.5913187Z * [new branch] bisect_perf_hf_T5_da94ab0b -> origin/bisect_perf_hf_T5_da94ab0b 2025-12-04T11:11:09.5913409Z * [new branch] bisect_perf_hf_T5_da94ab0b_new -> origin/bisect_perf_hf_T5_da94ab0b_new 2025-12-04T11:11:09.5913632Z * [new branch] bisect_perf_hf_T5_db4e8a1d8a8 -> origin/bisect_perf_hf_T5_db4e8a1d8a8 2025-12-04T11:11:09.5913856Z * [new branch] bisect_perf_hf_T5_e0d97e936a2 -> origin/bisect_perf_hf_T5_e0d97e936a2 2025-12-04T11:11:09.5914067Z * [new branch] bisect_perf_hf_T5_f23621ec563 -> origin/bisect_perf_hf_T5_f23621ec563 2025-12-04T11:11:09.5914275Z * [new branch] brister/fx_device_type -> origin/brister/fx_device_type 2025-12-04T11:11:09.5914492Z * [new branch] brister/test_inductor_all_fx -> origin/brister/test_inductor_all_fx 2025-12-04T11:11:09.5914747Z * [new branch] brister/tiled_reduction_no_numel_check -> origin/brister/tiled_reduction_no_numel_check 2025-12-04T11:11:09.5914978Z * [new branch] bwd-backup -> origin/bwd-backup 
2025-12-04T11:11:09.5915152Z * [new branch] c57382a49 -> origin/c57382a49 2025-12-04T11:11:09.5915322Z * [new branch] ca_0431d47eaa -> origin/ca_0431d47eaa 2025-12-04T11:11:09.5915498Z * [new branch] ca_fix_0431d47eaa -> origin/ca_fix_0431d47eaa 2025-12-04T11:11:09.5915701Z * [new branch] camyllh/test_setup_hooks_push -> origin/camyllh/test_setup_hooks_push 2025-12-04T11:11:09.5915921Z * [new branch] cccclai-patch-1 -> origin/cccclai-patch-1 2025-12-04T11:11:09.5916208Z * [new branch] cherry-pick-159969-by-pytorch_bot_bot_ -> origin/cherry-pick-159969-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5916518Z * [new branch] cherry-pick-160586-by-pytorch_bot_bot_ -> origin/cherry-pick-160586-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5916794Z * [new branch] cherry-pick-162208-by-pytorch_bot_bot_ -> origin/cherry-pick-162208-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5917150Z * [new branch] cherry-pick-163169-by-pytorch_bot_bot_ -> origin/cherry-pick-163169-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5917428Z * [new branch] cherry-pick-165086-by-pytorch_bot_bot_ -> origin/cherry-pick-165086-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5917700Z * [new branch] cherry-pick-165514-by-pytorch_bot_bot_ -> origin/cherry-pick-165514-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5917976Z * [new branch] cherry-pick-165601-by-pytorch_bot_bot_ -> origin/cherry-pick-165601-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5918297Z * [new branch] cherry-pick-165667-by-pytorch_bot_bot_ -> origin/cherry-pick-165667-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5918574Z * [new branch] cherry-pick-165815-by-pytorch_bot_bot_ -> origin/cherry-pick-165815-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5918854Z * [new branch] cherry-pick-165922-by-pytorch_bot_bot_ -> origin/cherry-pick-165922-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5919135Z * [new branch] cherry-pick-166148-by-pytorch_bot_bot_ -> origin/cherry-pick-166148-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5919415Z * [new branch] cherry-pick-166181-by-pytorch_bot_bot_ -> origin/cherry-pick-166181-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5919692Z * [new branch] cherry-pick-166404-by-pytorch_bot_bot_ -> origin/cherry-pick-166404-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5919969Z * [new branch] cherry-pick-166427-by-pytorch_bot_bot_ -> origin/cherry-pick-166427-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5920241Z * [new branch] cherry-pick-166480-by-pytorch_bot_bot_ -> origin/cherry-pick-166480-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5920522Z * [new branch] cherry-pick-166570-by-pytorch_bot_bot_ -> origin/cherry-pick-166570-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5920799Z * [new branch] cherry-pick-166993-by-pytorch_bot_bot_ -> origin/cherry-pick-166993-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5921110Z * [new branch] cherry-pick-167111-by-pytorch_bot_bot_ -> origin/cherry-pick-167111-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5921384Z * [new branch] cherry-pick-167478-by-pytorch_bot_bot_ -> origin/cherry-pick-167478-by-pytorch_bot_bot_ 2025-12-04T11:11:09.5921628Z * [new branch] cherry_pick_166036_166040 -> origin/cherry_pick_166036_166040 2025-12-04T11:11:09.5921825Z * [new branch] cherry_pick_166457 -> origin/cherry_pick_166457 2025-12-04T11:11:09.5922009Z * [new branch] cherrypick_166338 -> origin/cherrypick_166338 2025-12-04T11:11:09.5922193Z * [new branch] cherrypick_166458 -> origin/cherrypick_166458 2025-12-04T11:11:09.5922378Z * [new branch] cherrypick_166586 -> origin/cherrypick_166586 2025-12-04T11:11:09.5922557Z * [new branch] cherrypick_166956 -> origin/cherrypick_166956 2025-12-04T11:11:09.5922733Z * [new 
branch] ci_attn -> origin/ci_attn 2025-12-04T11:11:09.5922909Z * [new branch] codex-testing -> origin/codex-testing 2025-12-04T11:11:09.5923288Z * [new branch] codex/add-check_memory_overlap-helper-functions -> origin/codex/add-check_memory_overlap-helper-functions 2025-12-04T11:11:09.5923597Z * [new branch] codex/fix-issue-121219-in-pytorch -> origin/codex/fix-issue-121219-in-pytorch 2025-12-04T11:11:09.5923923Z * [new branch] codex/investigate-segfaults-in-get_tensor_storage_id -> origin/codex/investigate-segfaults-in-get_tensor_storage_id 2025-12-04T11:11:09.5924294Z * [new branch] codex/refactor-lintrunner-config-to-use-uv-run -> origin/codex/refactor-lintrunner-config-to-use-uv-run 2025-12-04T11:11:09.5924608Z * [new branch] compatiblpy39util -> origin/compatiblpy39util 2025-12-04T11:11:09.5924796Z * [new branch] cond_hop_device -> origin/cond_hop_device 2025-12-04T11:11:09.5925007Z * [new branch] context_test -> origin/context_test 2025-12-04T11:11:09.5925250Z * [new branch] copilot/code-style-cleanup-python-pip -> origin/copilot/code-style-cleanup-python-pip 2025-12-04T11:11:09.5925500Z * [new branch] cpio/fix_new_ami_tests -> origin/cpio/fix_new_ami_tests 2025-12-04T11:11:09.5925722Z * [new branch] cpp-docs-dependency-upgrade -> origin/cpp-docs-dependency-upgrade 2025-12-04T11:11:09.5926004Z * [new branch] crpa/typo-in-inductor_comm_lowering -> origin/crpa/typo-in-inductor_comm_lowering 2025-12-04T11:11:09.5926238Z * [new branch] csl/always_produce_xml -> origin/csl/always_produce_xml 2025-12-04T11:11:09.5926446Z * [new branch] csl/build_test_more_procs -> origin/csl/build_test_more_procs 2025-12-04T11:11:09.5926661Z * [new branch] csl/build_test_more_procs2 -> origin/csl/build_test_more_procs2 2025-12-04T11:11:09.5926857Z * [new branch] csl/clean_up -> origin/csl/clean_up 2025-12-04T11:11:09.5927051Z * [new branch] csl/fix_retry_segfault_exit -> origin/csl/fix_retry_segfault_exit 2025-12-04T11:11:09.5927248Z * [new branch] csl/katex -> origin/csl/katex 2025-12-04T11:11:09.5927430Z * [new branch] csl/larger_runner -> origin/csl/larger_runner 2025-12-04T11:11:09.5927610Z * [new branch] csl/lint_testing -> origin/csl/lint_testing 2025-12-04T11:11:09.5927793Z * [new branch] csl/lint_thing -> origin/csl/lint_thing 2025-12-04T11:11:09.5927981Z * [new branch] csl/lintrunner_stuff -> origin/csl/lintrunner_stuff 2025-12-04T11:11:09.5928230Z * [new branch] csl/manually_gen_json -> origin/csl/manually_gen_json 2025-12-04T11:11:09.5928420Z * [new branch] csl/mps_sharding -> origin/csl/mps_sharding 2025-12-04T11:11:09.5928614Z * [new branch] csl/multistage_docker -> origin/csl/multistage_docker 2025-12-04T11:11:09.5928799Z * [new branch] csl/print_timing -> origin/csl/print_timing 2025-12-04T11:11:09.5928987Z * [new branch] csl/remove_experiment -> origin/csl/remove_experiment 2025-12-04T11:11:09.5929191Z * [new branch] csl/remove_maybe_unused_var -> origin/csl/remove_maybe_unused_var 2025-12-04T11:11:09.5929432Z * [new branch] csl/remove_repo_specific_autolabel -> origin/csl/remove_repo_specific_autolabel 2025-12-04T11:11:09.5929661Z * [new branch] csl/remove_run_parallel -> origin/csl/remove_run_parallel 2025-12-04T11:11:09.5929855Z * [new branch] csl/remove_unused_vars -> origin/csl/remove_unused_vars 2025-12-04T11:11:09.5930050Z * [new branch] csl/revert_open -> origin/csl/revert_open 2025-12-04T11:11:09.5930228Z * [new branch] csl/skip_build -> origin/csl/skip_build 2025-12-04T11:11:09.5930427Z * [new branch] csl/smaller_avx_amx_runenrs -> origin/csl/smaller_avx_amx_runenrs 
2025-12-04T11:11:09.5930628Z * [new branch] csl/td_job_level -> origin/csl/td_job_level 2025-12-04T11:11:09.5930840Z * [new branch] csl/test_cuda_build_large_runner -> origin/csl/test_cuda_build_large_runner 2025-12-04T11:11:09.5931092Z * [new branch] csl/test_owners_autograd_dispatch_nn -> origin/csl/test_owners_autograd_dispatch_nn 2025-12-04T11:11:09.5931350Z * [new branch] csl/test_owners_higher_confidence -> origin/csl/test_owners_higher_confidence 2025-12-04T11:11:09.5931579Z * [new branch] csl/upload_json_running -> origin/csl/upload_json_running 2025-12-04T11:11:09.5931801Z * [new branch] csl/win_sccache -> origin/csl/win_sccache 2025-12-04T11:11:09.5931978Z * [new branch] csl/xml_stuff -> origin/csl/xml_stuff 2025-12-04T11:11:09.5932189Z * [new branch] cublasrelax2 -> origin/cublasrelax2 2025-12-04T11:11:09.5932361Z * [new branch] cuda_mempool -> origin/cuda_mempool 2025-12-04T11:11:09.5932548Z * [new branch] custom_lowering_dict -> origin/custom_lowering_dict 2025-12-04T11:11:09.5932753Z * [new branch] d4l3k/debug_plane_frtrace -> origin/d4l3k/debug_plane_frtrace 2025-12-04T11:11:09.5932943Z * [new branch] daxia6/2.8o3 -> origin/daxia6/2.8o3 2025-12-04T11:11:09.5933120Z * [new branch] debug-guard -> origin/debug-guard 2025-12-04T11:11:09.5933307Z * [new branch] delete-quant-docs -> origin/delete-quant-docs 2025-12-04T11:11:09.5933640Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.0 2025-12-04T11:11:09.5934099Z * [new branch] dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 -> origin/dependabot/pip/dot-ci/docker/ci_commit_pins/main/transformers-4.57.1 2025-12-04T11:11:09.5934442Z * [new branch] desertfire/test_cpp_wrapper -> origin/desertfire/test_cpp_wrapper 2025-12-04T11:11:09.5934685Z * [new branch] desertfire/triton-cpu-for-aarch64 -> origin/desertfire/triton-cpu-for-aarch64 2025-12-04T11:11:09.5934924Z * [new branch] dev/dhruva/flex_attn_opt -> origin/dev/dhruva/flex_attn_opt 2025-12-04T11:11:09.5935134Z * [new branch] dev/joona/MPSNDArrayAdd -> origin/dev/joona/MPSNDArrayAdd 2025-12-04T11:11:09.5935330Z * [new branch] dev/joona/Unranked -> origin/dev/joona/Unranked 2025-12-04T11:11:09.5935535Z * [new branch] dev/joona/cat -> origin/dev/joona/cat 2025-12-04T11:11:09.5935724Z * [new branch] dev/joona/embeddingbag -> origin/dev/joona/embeddingbag 2025-12-04T11:11:09.5935934Z * [new branch] dev/joona/fix_sdpa_memtest -> origin/dev/joona/fix_sdpa_memtest 2025-12-04T11:11:09.5936155Z * [new branch] dev/joona/getTensorsString -> origin/dev/joona/getTensorsString 2025-12-04T11:11:09.5936383Z * [new branch] dev/joona/mps_linear_macos14 -> origin/dev/joona/mps_linear_macos14 2025-12-04T11:11:09.5936592Z * [new branch] dev/joona/scalar_clamp -> origin/dev/joona/scalar_clamp 2025-12-04T11:11:09.5936779Z * [new branch] dev/joona/sdpa -> origin/dev/joona/sdpa 2025-12-04T11:11:09.5936966Z * [new branch] dev/joona/sdpa_api -> origin/dev/joona/sdpa_api 2025-12-04T11:11:09.5937150Z * [new branch] dev/joona/type_inf -> origin/dev/joona/type_inf 2025-12-04T11:11:09.5937356Z * [new branch] dev/joona/ulpAssertClose -> origin/dev/joona/ulpAssertClose 2025-12-04T11:11:09.5937556Z * [new branch] dev/joona/upsize3d -> origin/dev/joona/upsize3d 2025-12-04T11:11:09.5937733Z * [new branch] disp_counter -> origin/disp_counter 2025-12-04T11:11:09.5937917Z * [new branch] divyanshk-patch-1 -> origin/divyanshk-patch-1 2025-12-04T11:11:09.5938097Z * [new branch] docs -> 
origin/docs 2025-12-04T11:11:09.5938298Z * [new branch] documentation -> origin/documentation 2025-12-04T11:11:09.5938486Z * [new branch] eager_model_benchmarks -> origin/eager_model_benchmarks 2025-12-04T11:11:09.5938704Z * [new branch] embg/test_inductor_ci_control -> origin/embg/test_inductor_ci_control 2025-12-04T11:11:09.5938950Z * [new branch] embg/triton_l2_prefetch_128B -> origin/embg/triton_l2_prefetch_128B 2025-12-04T11:11:09.5939253Z * [new branch] embg/triton_l2_prefetch_256B -> origin/embg/triton_l2_prefetch_256B 2025-12-04T11:11:09.5939457Z * [new branch] eqy-patch-1 -> origin/eqy-patch-1 2025-12-04T11:11:09.5939658Z * [new branch] eqy-patch-2 -> origin/eqy-patch-2 2025-12-04T11:11:09.5939833Z * [new branch] eqy-patch-3 -> origin/eqy-patch-3 2025-12-04T11:11:09.5940002Z * [new branch] eqy-patch-4 -> origin/eqy-patch-4 2025-12-04T11:11:09.5940167Z * [new branch] eqy-patch-5 -> origin/eqy-patch-5 2025-12-04T11:11:09.5940335Z * [new branch] eqy-patch-6 -> origin/eqy-patch-6 2025-12-04T11:11:09.5940516Z * [new branch] exclamaforte/amd-ma -> origin/exclamaforte/amd-ma 2025-12-04T11:11:09.5940758Z * [new branch] exclamaforte/combo-kernels-perf-run -> origin/exclamaforte/combo-kernels-perf-run 2025-12-04T11:11:09.5941023Z * [new branch] exclamaforte/do_bench_refactor -> origin/exclamaforte/do_bench_refactor 2025-12-04T11:11:09.5941274Z * [new branch] exclamaforte/enable-mem-dep-fusion -> origin/exclamaforte/enable-mem-dep-fusion 2025-12-04T11:11:09.5941568Z * [new branch] exclamaforte/fix-exhaustive-autotuning -> origin/exclamaforte/fix-exhaustive-autotuning 2025-12-04T11:11:09.5941868Z * [new branch] exclamaforte/fix-trace-parsing-fx-svg -> origin/exclamaforte/fix-trace-parsing-fx-svg 2025-12-04T11:11:09.5942175Z * [new branch] exclamaforte/force-pointwise-cat-perf-run -> origin/exclamaforte/force-pointwise-cat-perf-run 2025-12-04T11:11:09.5942444Z * [new branch] exclamaforte/fusion-data -> origin/exclamaforte/fusion-data 2025-12-04T11:11:09.5942682Z * [new branch] exclamaforte/gemm-benchmark-run -> origin/exclamaforte/gemm-benchmark-run 2025-12-04T11:11:09.5942937Z * [new branch] exclamaforte/gemm-export-model -> origin/exclamaforte/gemm-export-model 2025-12-04T11:11:09.5943165Z * [new branch] exclamaforte/gemm-model -> origin/exclamaforte/gemm-model 2025-12-04T11:11:09.5943443Z * [new branch] exclamaforte/gemm-model-all-data-collection -> origin/exclamaforte/gemm-model-all-data-collection 2025-12-04T11:11:09.5943712Z * [new branch] exclamaforte/gemm-to-amd -> origin/exclamaforte/gemm-to-amd 2025-12-04T11:11:09.5943941Z * [new branch] exclamaforte/just-gemm-model -> origin/exclamaforte/just-gemm-model 2025-12-04T11:11:09.5944214Z * [new branch] exclamaforte/just-gemm-model-no-refactor -> origin/exclamaforte/just-gemm-model-no-refactor 2025-12-04T11:11:09.5944490Z * [new branch] exclamaforte/profile-diff-algo -> origin/exclamaforte/profile-diff-algo 2025-12-04T11:11:09.5944756Z * [new branch] exclamaforte/profiler-visualization -> origin/exclamaforte/profiler-visualization 2025-12-04T11:11:09.5945030Z * [new branch] exclamaforte/test_cpp_wrapper_mode -> origin/exclamaforte/test_cpp_wrapper_mode 2025-12-04T11:11:09.5945300Z * [new branch] exclamaforte/update-autotune-configs -> origin/exclamaforte/update-autotune-configs 2025-12-04T11:11:09.5945596Z * [new branch] exclamaforte/update-autotune-configs-2 -> origin/exclamaforte/update-autotune-configs-2 2025-12-04T11:11:09.5945828Z * [new branch] exec -> origin/exec 2025-12-04T11:11:09.5946005Z * [new branch] experimental-mosaic -> 
origin/experimental-mosaic 2025-12-04T11:11:09.5946200Z * [new branch] export-D61047529 -> origin/export-D61047529 2025-12-04T11:11:09.5946425Z * [new branch] export-D71412006 -> origin/export-D71412006 2025-12-04T11:11:09.5946604Z * [new branch] export-D73042989 -> origin/export-D73042989 2025-12-04T11:11:09.5946817Z * [new branch] export-D78957093 -> origin/export-D78957093 2025-12-04T11:11:09.5946997Z * [new branch] export-D78996107 -> origin/export-D78996107 2025-12-04T11:11:09.5947169Z * [new branch] export-D80823877 -> origin/export-D80823877 2025-12-04T11:11:09.5947373Z * [new branch] export-D80958642 -> origin/export-D80958642 2025-12-04T11:11:09.5947549Z * [new branch] export-D81054193 -> origin/export-D81054193 2025-12-04T11:11:09.5947723Z * [new branch] export-D81204584 -> origin/export-D81204584 2025-12-04T11:11:09.5947902Z * [new branch] export-D81429090 -> origin/export-D81429090 2025-12-04T11:11:09.5948081Z * [new branch] export-D82250826 -> origin/export-D82250826 2025-12-04T11:11:09.5948302Z * [new branch] export-D82253817 -> origin/export-D82253817 2025-12-04T11:11:09.5948482Z * [new branch] export-D83541846 -> origin/export-D83541846 2025-12-04T11:11:09.5948661Z * [new branch] export-D83627170 -> origin/export-D83627170 2025-12-04T11:11:09.5948834Z * [new branch] export-D83766701 -> origin/export-D83766701 2025-12-04T11:11:09.5949013Z * [new branch] export-D83768878 -> origin/export-D83768878 2025-12-04T11:11:09.5949189Z * [new branch] export-D83769447 -> origin/export-D83769447 2025-12-04T11:11:09.5949361Z * [new branch] export-D84089824 -> origin/export-D84089824 2025-12-04T11:11:09.5949537Z * [new branch] export-D84213020 -> origin/export-D84213020 2025-12-04T11:11:09.5949708Z * [new branch] export-D84373821 -> origin/export-D84373821 2025-12-04T11:11:09.5949883Z * [new branch] export-D84612194 -> origin/export-D84612194 2025-12-04T11:11:09.5950062Z * [new branch] export-D84890985 -> origin/export-D84890985 2025-12-04T11:11:09.5950234Z * [new branch] export-D85122326 -> origin/export-D85122326 2025-12-04T11:11:09.5950414Z * [new branch] export-D86256198 -> origin/export-D86256198 2025-12-04T11:11:09.5950595Z * [new branch] export-D86460608 -> origin/export-D86460608 2025-12-04T11:11:09.5950767Z * [new branch] export-D86474796 -> origin/export-D86474796 2025-12-04T11:11:09.5950944Z * [new branch] export-D86712396 -> origin/export-D86712396 2025-12-04T11:11:09.5951121Z * [new branch] export-D87022129 -> origin/export-D87022129 2025-12-04T11:11:09.5951296Z * [new branch] export-D87838959 -> origin/export-D87838959 2025-12-04T11:11:09.5951474Z * [new branch] export-D88319437 -> origin/export-D88319437 2025-12-04T11:11:09.5951705Z * [new branch] exported-model-train-idempotent -> origin/exported-model-train-idempotent 2025-12-04T11:11:09.5951939Z * [new branch] ezyang-titan-october -> origin/ezyang-titan-october 2025-12-04T11:11:09.5952143Z * [new branch] ezyang-titan-october2 -> origin/ezyang-titan-october2 2025-12-04T11:11:09.5952336Z * [new branch] ezyang-war -> origin/ezyang-war 2025-12-04T11:11:09.5952535Z * [new branch] ezyang/wip-aot-descriptors -> origin/ezyang/wip-aot-descriptors 2025-12-04T11:11:09.5952737Z * [new branch] fa_u8_brgemm -> origin/fa_u8_brgemm 2025-12-04T11:11:09.5952934Z * [new branch] fadeputr/sequence_fbgemm -> origin/fadeputr/sequence_fbgemm 2025-12-04T11:11:09.5953128Z * [new branch] fastmath_baseline -> origin/fastmath_baseline 2025-12-04T11:11:09.5953310Z * [new branch] fbcode/warm -> origin/fbcode/warm 2025-12-04T11:11:09.5953501Z * [new 
branch] fca -> origin/fca 2025-12-04T11:11:09.5953707Z * [new branch] fca2_ca5984c -> origin/fca2_ca5984c 2025-12-04T11:11:09.5953873Z * [new branch] fca5 -> origin/fca5 2025-12-04T11:11:09.5954087Z * [new branch] feature/justknobs-cpp -> origin/feature/justknobs-cpp 2025-12-04T11:11:09.5954294Z * [new branch] feature/numa-forkserver -> origin/feature/numa-forkserver 2025-12-04T11:11:09.5954488Z * [new branch] ffast_math_baseline -> origin/ffast_math_baseline 2025-12-04T11:11:09.5954677Z * [new branch] ffast_math_target -> origin/ffast_math_target 2025-12-04T11:11:09.5954867Z * [new branch] findhao/base_commit -> origin/findhao/base_commit 2025-12-04T11:11:09.5955099Z * [new branch] findhao/base_commit1 -> origin/findhao/base_commit1 2025-12-04T11:11:09.5955296Z * [new branch] findhao/multistream2 -> origin/findhao/multistream2 2025-12-04T11:11:09.5955496Z * [new branch] findhao/multistream5 -> origin/findhao/multistream5 2025-12-04T11:11:09.5955685Z * [new branch] findhao/multistream6 -> origin/findhao/multistream6 2025-12-04T11:11:09.5955887Z * [new branch] findhao/operatorbench3 -> origin/findhao/operatorbench3 2025-12-04T11:11:09.5956089Z * [new branch] findhao/operatorbench5 -> origin/findhao/operatorbench5 2025-12-04T11:11:09.5956288Z * [new branch] findhao/tritonparse -> origin/findhao/tritonparse 2025-12-04T11:11:09.5956507Z * [new branch] fix-ck-gemm-template-format -> origin/fix-ck-gemm-template-format 2025-12-04T11:11:09.5956720Z * [new branch] fix-config-ignore -> origin/fix-config-ignore 2025-12-04T11:11:09.5956908Z * [new branch] fix-dict-guard -> origin/fix-dict-guard 2025-12-04T11:11:09.5957088Z * [new branch] fix_addmm_issue -> origin/fix_addmm_issue 2025-12-04T11:11:09.5957291Z * [new branch] fix_amd_missing_cluster_dims -> origin/fix_amd_missing_cluster_dims 2025-12-04T11:11:09.5957499Z * [new branch] fix_bench_bwd_pass -> origin/fix_bench_bwd_pass 2025-12-04T11:11:09.5957698Z * [new branch] fix_mem_profiler_config -> origin/fix_mem_profiler_config 2025-12-04T11:11:09.5957885Z * [new branch] fix_nvrtc_discovery -> origin/fix_nvrtc_discovery 2025-12-04T11:11:09.5958066Z * [new branch] fix_op_runner -> origin/fix_op_runner 2025-12-04T11:11:09.5958280Z * [new branch] fix_ubn_159469 -> origin/fix_ubn_159469 2025-12-04T11:11:09.5958451Z * [new branch] fixes-triage -> origin/fixes-triage 2025-12-04T11:11:09.5958631Z * [new branch] fixflashinfer -> origin/fixflashinfer 2025-12-04T11:11:09.5958816Z * [new branch] flash_decoding_cpu -> origin/flash_decoding_cpu 2025-12-04T11:11:09.5958996Z * [new branch] flex-flash -> origin/flex-flash 2025-12-04T11:11:09.5959204Z * [new branch] flex_attention_functorch_grad -> origin/flex_attention_functorch_grad 2025-12-04T11:11:09.5959413Z * [new branch] flex_flash -> origin/flex_flash 2025-12-04T11:11:09.5959617Z * [new branch] fmassa/fix_memeff_sharding_rule -> origin/fmassa/fix_memeff_sharding_rule 2025-12-04T11:11:09.5959869Z * [new branch] fmassa/tests_comm_compute_scheduler -> origin/fmassa/tests_comm_compute_scheduler 2025-12-04T11:11:09.5960093Z * [new branch] forkserver_fix -> origin/forkserver_fix 2025-12-04T11:11:09.5960273Z * [new branch] fsdp2_trace_rules -> origin/fsdp2_trace_rules 2025-12-04T11:11:09.5982399Z * [new branch] fx_cpp -> origin/fx_cpp 2025-12-04T11:11:09.5982639Z * [new branch] fy/fix-win -> origin/fy/fix-win 2025-12-04T11:11:09.5983818Z * [new branch] galv-patch-1 -> origin/galv-patch-1 2025-12-04T11:11:09.5984064Z * [new branch] galv/cudagraphs-conditional-nodes-4 -> origin/galv/cudagraphs-conditional-nodes-4 
2025-12-04T11:11:09.5984364Z * [new branch] georgehong/cmakelists-patch -> origin/georgehong/cmakelists-patch 2025-12-04T11:11:09.5984589Z * [new branch] gh/AlnisM/1/base -> origin/gh/AlnisM/1/base 2025-12-04T11:11:09.5984782Z * [new branch] gh/AlnisM/1/head -> origin/gh/AlnisM/1/head 2025-12-04T11:11:09.5984974Z * [new branch] gh/EikanWang/67/base -> origin/gh/EikanWang/67/base 2025-12-04T11:11:09.5985177Z * [new branch] gh/EikanWang/67/head -> origin/gh/EikanWang/67/head 2025-12-04T11:11:09.5985380Z * [new branch] gh/Gasoonjia/1/base -> origin/gh/Gasoonjia/1/base 2025-12-04T11:11:09.5985577Z * [new branch] gh/Gasoonjia/1/head -> origin/gh/Gasoonjia/1/head 2025-12-04T11:11:09.5985774Z * [new branch] gh/H-Huang/131/base -> origin/gh/H-Huang/131/base 2025-12-04T11:11:09.5985967Z * [new branch] gh/H-Huang/131/head -> origin/gh/H-Huang/131/head 2025-12-04T11:11:09.5986157Z * [new branch] gh/H-Huang/131/orig -> origin/gh/H-Huang/131/orig 2025-12-04T11:11:09.5986348Z * [new branch] gh/H-Huang/132/base -> origin/gh/H-Huang/132/base 2025-12-04T11:11:09.5986539Z * [new branch] gh/H-Huang/132/head -> origin/gh/H-Huang/132/head 2025-12-04T11:11:09.5986722Z * [new branch] gh/H-Huang/132/orig -> origin/gh/H-Huang/132/orig 2025-12-04T11:11:09.5986912Z * [new branch] gh/H-Huang/180/base -> origin/gh/H-Huang/180/base 2025-12-04T11:11:09.5987095Z * [new branch] gh/H-Huang/180/head -> origin/gh/H-Huang/180/head 2025-12-04T11:11:09.5987291Z * [new branch] gh/H-Huang/180/orig -> origin/gh/H-Huang/180/orig 2025-12-04T11:11:09.5987480Z * [new branch] gh/H-Huang/182/base -> origin/gh/H-Huang/182/base 2025-12-04T11:11:09.5987664Z * [new branch] gh/H-Huang/182/head -> origin/gh/H-Huang/182/head 2025-12-04T11:11:09.5987857Z * [new branch] gh/H-Huang/182/orig -> origin/gh/H-Huang/182/orig 2025-12-04T11:11:09.5988046Z * [new branch] gh/H-Huang/226/base -> origin/gh/H-Huang/226/base 2025-12-04T11:11:09.5988273Z * [new branch] gh/H-Huang/226/head -> origin/gh/H-Huang/226/head 2025-12-04T11:11:09.5988461Z * [new branch] gh/H-Huang/226/orig -> origin/gh/H-Huang/226/orig 2025-12-04T11:11:09.5988649Z * [new branch] gh/H-Huang/228/base -> origin/gh/H-Huang/228/base 2025-12-04T11:11:09.5988834Z * [new branch] gh/H-Huang/228/head -> origin/gh/H-Huang/228/head 2025-12-04T11:11:09.5989029Z * [new branch] gh/H-Huang/228/orig -> origin/gh/H-Huang/228/orig 2025-12-04T11:11:09.5989235Z * [new branch] gh/IvanKobzarev/150/base -> origin/gh/IvanKobzarev/150/base 2025-12-04T11:11:09.5989446Z * [new branch] gh/IvanKobzarev/150/head -> origin/gh/IvanKobzarev/150/head 2025-12-04T11:11:09.5989662Z * [new branch] gh/IvanKobzarev/150/orig -> origin/gh/IvanKobzarev/150/orig 2025-12-04T11:11:09.5989874Z * [new branch] gh/IvanKobzarev/157/base -> origin/gh/IvanKobzarev/157/base 2025-12-04T11:11:09.5990079Z * [new branch] gh/IvanKobzarev/157/head -> origin/gh/IvanKobzarev/157/head 2025-12-04T11:11:09.5990288Z * [new branch] gh/IvanKobzarev/157/orig -> origin/gh/IvanKobzarev/157/orig 2025-12-04T11:11:09.5990498Z * [new branch] gh/IvanKobzarev/159/base -> origin/gh/IvanKobzarev/159/base 2025-12-04T11:11:09.5990703Z * [new branch] gh/IvanKobzarev/159/head -> origin/gh/IvanKobzarev/159/head 2025-12-04T11:11:09.5990953Z * [new branch] gh/IvanKobzarev/159/orig -> origin/gh/IvanKobzarev/159/orig 2025-12-04T11:11:09.5991165Z * [new branch] gh/IvanKobzarev/162/base -> origin/gh/IvanKobzarev/162/base 2025-12-04T11:11:09.5991406Z * [new branch] gh/IvanKobzarev/162/head -> origin/gh/IvanKobzarev/162/head 2025-12-04T11:11:09.5991617Z * [new branch] 
gh/IvanKobzarev/162/orig -> origin/gh/IvanKobzarev/162/orig 2025-12-04T11:11:09.5991827Z * [new branch] gh/IvanKobzarev/163/base -> origin/gh/IvanKobzarev/163/base 2025-12-04T11:11:09.5992029Z * [new branch] gh/IvanKobzarev/163/head -> origin/gh/IvanKobzarev/163/head 2025-12-04T11:11:09.5992240Z * [new branch] gh/IvanKobzarev/163/orig -> origin/gh/IvanKobzarev/163/orig 2025-12-04T11:11:09.5992443Z * [new branch] gh/IvanKobzarev/166/base -> origin/gh/IvanKobzarev/166/base 2025-12-04T11:11:09.5992660Z * [new branch] gh/IvanKobzarev/166/head -> origin/gh/IvanKobzarev/166/head 2025-12-04T11:11:09.5992870Z * [new branch] gh/IvanKobzarev/166/orig -> origin/gh/IvanKobzarev/166/orig 2025-12-04T11:11:09.5993078Z * [new branch] gh/IvanKobzarev/167/base -> origin/gh/IvanKobzarev/167/base 2025-12-04T11:11:09.5993291Z * [new branch] gh/IvanKobzarev/167/head -> origin/gh/IvanKobzarev/167/head 2025-12-04T11:11:09.5993502Z * [new branch] gh/IvanKobzarev/167/orig -> origin/gh/IvanKobzarev/167/orig 2025-12-04T11:11:09.5993706Z * [new branch] gh/IvanKobzarev/168/base -> origin/gh/IvanKobzarev/168/base 2025-12-04T11:11:09.5993916Z * [new branch] gh/IvanKobzarev/168/head -> origin/gh/IvanKobzarev/168/head 2025-12-04T11:11:09.5994125Z * [new branch] gh/IvanKobzarev/168/orig -> origin/gh/IvanKobzarev/168/orig 2025-12-04T11:11:09.5994329Z * [new branch] gh/IvanKobzarev/169/base -> origin/gh/IvanKobzarev/169/base 2025-12-04T11:11:09.5994541Z * [new branch] gh/IvanKobzarev/169/head -> origin/gh/IvanKobzarev/169/head 2025-12-04T11:11:09.5994750Z * [new branch] gh/IvanKobzarev/169/orig -> origin/gh/IvanKobzarev/169/orig 2025-12-04T11:11:09.5994958Z * [new branch] gh/IvanKobzarev/170/base -> origin/gh/IvanKobzarev/170/base 2025-12-04T11:11:09.5995167Z * [new branch] gh/IvanKobzarev/170/head -> origin/gh/IvanKobzarev/170/head 2025-12-04T11:11:09.5995377Z * [new branch] gh/IvanKobzarev/170/orig -> origin/gh/IvanKobzarev/170/orig 2025-12-04T11:11:09.5995580Z * [new branch] gh/IvanKobzarev/171/base -> origin/gh/IvanKobzarev/171/base 2025-12-04T11:11:09.5995789Z * [new branch] gh/IvanKobzarev/171/head -> origin/gh/IvanKobzarev/171/head 2025-12-04T11:11:09.5995998Z * [new branch] gh/IvanKobzarev/171/orig -> origin/gh/IvanKobzarev/171/orig 2025-12-04T11:11:09.5996203Z * [new branch] gh/IvanKobzarev/172/base -> origin/gh/IvanKobzarev/172/base 2025-12-04T11:11:09.5996414Z * [new branch] gh/IvanKobzarev/172/head -> origin/gh/IvanKobzarev/172/head 2025-12-04T11:11:09.5996624Z * [new branch] gh/IvanKobzarev/172/orig -> origin/gh/IvanKobzarev/172/orig 2025-12-04T11:11:09.5996829Z * [new branch] gh/IvanKobzarev/173/base -> origin/gh/IvanKobzarev/173/base 2025-12-04T11:11:09.5997036Z * [new branch] gh/IvanKobzarev/173/head -> origin/gh/IvanKobzarev/173/head 2025-12-04T11:11:09.5997245Z * [new branch] gh/IvanKobzarev/173/orig -> origin/gh/IvanKobzarev/173/orig 2025-12-04T11:11:09.5997447Z * [new branch] gh/IvanKobzarev/174/base -> origin/gh/IvanKobzarev/174/base 2025-12-04T11:11:09.5997656Z * [new branch] gh/IvanKobzarev/174/head -> origin/gh/IvanKobzarev/174/head 2025-12-04T11:11:09.5997864Z * [new branch] gh/IvanKobzarev/174/orig -> origin/gh/IvanKobzarev/174/orig 2025-12-04T11:11:09.5998104Z * [new branch] gh/IvanKobzarev/175/base -> origin/gh/IvanKobzarev/175/base 2025-12-04T11:11:09.5998356Z * [new branch] gh/IvanKobzarev/175/head -> origin/gh/IvanKobzarev/175/head 2025-12-04T11:11:09.5998603Z * [new branch] gh/IvanKobzarev/175/orig -> origin/gh/IvanKobzarev/175/orig 2025-12-04T11:11:09.5998806Z * [new branch] 
gh/IvanKobzarev/176/base -> origin/gh/IvanKobzarev/176/base 2025-12-04T11:11:09.5999014Z * [new branch] gh/IvanKobzarev/176/head -> origin/gh/IvanKobzarev/176/head 2025-12-04T11:11:09.5999220Z * [new branch] gh/IvanKobzarev/176/orig -> origin/gh/IvanKobzarev/176/orig 2025-12-04T11:11:09.5999430Z * [new branch] gh/IvanKobzarev/177/base -> origin/gh/IvanKobzarev/177/base 2025-12-04T11:11:09.5999639Z * [new branch] gh/IvanKobzarev/177/head -> origin/gh/IvanKobzarev/177/head 2025-12-04T11:11:09.5999847Z * [new branch] gh/IvanKobzarev/177/orig -> origin/gh/IvanKobzarev/177/orig 2025-12-04T11:11:09.6000056Z * [new branch] gh/IvanKobzarev/178/base -> origin/gh/IvanKobzarev/178/base 2025-12-04T11:11:09.6000264Z * [new branch] gh/IvanKobzarev/178/head -> origin/gh/IvanKobzarev/178/head 2025-12-04T11:11:09.6000469Z * [new branch] gh/IvanKobzarev/178/orig -> origin/gh/IvanKobzarev/178/orig 2025-12-04T11:11:09.6000673Z * [new branch] gh/IvanKobzarev/179/base -> origin/gh/IvanKobzarev/179/base 2025-12-04T11:11:09.6000879Z * [new branch] gh/IvanKobzarev/179/head -> origin/gh/IvanKobzarev/179/head 2025-12-04T11:11:09.6001079Z * [new branch] gh/IvanKobzarev/179/orig -> origin/gh/IvanKobzarev/179/orig 2025-12-04T11:11:09.6001282Z * [new branch] gh/IvanKobzarev/180/base -> origin/gh/IvanKobzarev/180/base 2025-12-04T11:11:09.6001485Z * [new branch] gh/IvanKobzarev/180/head -> origin/gh/IvanKobzarev/180/head 2025-12-04T11:11:09.6001689Z * [new branch] gh/IvanKobzarev/180/orig -> origin/gh/IvanKobzarev/180/orig 2025-12-04T11:11:09.6001892Z * [new branch] gh/IvanKobzarev/181/base -> origin/gh/IvanKobzarev/181/base 2025-12-04T11:11:09.6002102Z * [new branch] gh/IvanKobzarev/181/head -> origin/gh/IvanKobzarev/181/head 2025-12-04T11:11:09.6002301Z * [new branch] gh/IvanKobzarev/181/orig -> origin/gh/IvanKobzarev/181/orig 2025-12-04T11:11:09.6002506Z * [new branch] gh/IvanKobzarev/182/base -> origin/gh/IvanKobzarev/182/base 2025-12-04T11:11:09.6002710Z * [new branch] gh/IvanKobzarev/182/head -> origin/gh/IvanKobzarev/182/head 2025-12-04T11:11:09.6002911Z * [new branch] gh/IvanKobzarev/182/orig -> origin/gh/IvanKobzarev/182/orig 2025-12-04T11:11:09.6003115Z * [new branch] gh/IvanKobzarev/183/base -> origin/gh/IvanKobzarev/183/base 2025-12-04T11:11:09.6003321Z * [new branch] gh/IvanKobzarev/183/head -> origin/gh/IvanKobzarev/183/head 2025-12-04T11:11:09.6003524Z * [new branch] gh/IvanKobzarev/183/orig -> origin/gh/IvanKobzarev/183/orig 2025-12-04T11:11:09.6003728Z * [new branch] gh/IvanKobzarev/184/base -> origin/gh/IvanKobzarev/184/base 2025-12-04T11:11:09.6003935Z * [new branch] gh/IvanKobzarev/184/head -> origin/gh/IvanKobzarev/184/head 2025-12-04T11:11:09.6004135Z * [new branch] gh/IvanKobzarev/184/orig -> origin/gh/IvanKobzarev/184/orig 2025-12-04T11:11:09.6004344Z * [new branch] gh/NikhilAPatel/1/base -> origin/gh/NikhilAPatel/1/base 2025-12-04T11:11:09.6004549Z * [new branch] gh/NikhilAPatel/1/head -> origin/gh/NikhilAPatel/1/head 2025-12-04T11:11:09.6004747Z * [new branch] gh/NikhilAPatel/2/base -> origin/gh/NikhilAPatel/2/base 2025-12-04T11:11:09.6004946Z * [new branch] gh/NikhilAPatel/2/head -> origin/gh/NikhilAPatel/2/head 2025-12-04T11:11:09.6005176Z * [new branch] gh/NikhilAPatel/4/base -> origin/gh/NikhilAPatel/4/base 2025-12-04T11:11:09.6005377Z * [new branch] gh/NikhilAPatel/4/head -> origin/gh/NikhilAPatel/4/head 2025-12-04T11:11:09.6005576Z * [new branch] gh/NikhilAPatel/5/base -> origin/gh/NikhilAPatel/5/base 2025-12-04T11:11:09.6005796Z * [new branch] gh/NikhilAPatel/5/head -> 
origin/gh/NikhilAPatel/5/head 2025-12-04T11:11:09.6005995Z * [new branch] gh/NikhilAPatel/5/orig -> origin/gh/NikhilAPatel/5/orig 2025-12-04T11:11:09.6006190Z * [new branch] gh/PaliC/17/base -> origin/gh/PaliC/17/base 2025-12-04T11:11:09.6006372Z * [new branch] gh/PaliC/17/head -> origin/gh/PaliC/17/head 2025-12-04T11:11:09.6006553Z * [new branch] gh/PaliC/17/orig -> origin/gh/PaliC/17/orig 2025-12-04T11:11:09.6006734Z * [new branch] gh/PaliC/18/base -> origin/gh/PaliC/18/base 2025-12-04T11:11:09.6006914Z * [new branch] gh/PaliC/18/head -> origin/gh/PaliC/18/head 2025-12-04T11:11:09.6007095Z * [new branch] gh/PaliC/18/orig -> origin/gh/PaliC/18/orig 2025-12-04T11:11:09.6007273Z * [new branch] gh/PaliC/20/base -> origin/gh/PaliC/20/base 2025-12-04T11:11:09.6007453Z * [new branch] gh/PaliC/20/head -> origin/gh/PaliC/20/head 2025-12-04T11:11:09.6007633Z * [new branch] gh/PaliC/20/orig -> origin/gh/PaliC/20/orig 2025-12-04T11:11:09.6007811Z * [new branch] gh/PaliC/21/base -> origin/gh/PaliC/21/base 2025-12-04T11:11:09.6007985Z * [new branch] gh/PaliC/21/head -> origin/gh/PaliC/21/head 2025-12-04T11:11:09.6008192Z * [new branch] gh/PaliC/21/orig -> origin/gh/PaliC/21/orig 2025-12-04T11:11:09.6008371Z * [new branch] gh/PaliC/23/base -> origin/gh/PaliC/23/base 2025-12-04T11:11:09.6008549Z * [new branch] gh/PaliC/23/head -> origin/gh/PaliC/23/head 2025-12-04T11:11:09.6008727Z * [new branch] gh/PaliC/23/orig -> origin/gh/PaliC/23/orig 2025-12-04T11:11:09.6008907Z * [new branch] gh/PaliC/24/base -> origin/gh/PaliC/24/base 2025-12-04T11:11:09.6009085Z * [new branch] gh/PaliC/24/head -> origin/gh/PaliC/24/head 2025-12-04T11:11:09.6009264Z * [new branch] gh/PaliC/24/orig -> origin/gh/PaliC/24/orig 2025-12-04T11:11:09.6009439Z * [new branch] gh/PaliC/25/head -> origin/gh/PaliC/25/head 2025-12-04T11:11:09.6009617Z * [new branch] gh/PaliC/25/next -> origin/gh/PaliC/25/next 2025-12-04T11:11:09.6009794Z * [new branch] gh/PaliC/25/orig -> origin/gh/PaliC/25/orig 2025-12-04T11:11:09.6009967Z * [new branch] gh/PaliC/26/head -> origin/gh/PaliC/26/head 2025-12-04T11:11:09.6010146Z * [new branch] gh/PaliC/26/next -> origin/gh/PaliC/26/next 2025-12-04T11:11:09.6010325Z * [new branch] gh/PaliC/26/orig -> origin/gh/PaliC/26/orig 2025-12-04T11:11:09.6010498Z * [new branch] gh/PaliC/27/next -> origin/gh/PaliC/27/next 2025-12-04T11:11:09.6010676Z * [new branch] gh/PaliC/28/head -> origin/gh/PaliC/28/head 2025-12-04T11:11:09.6010854Z * [new branch] gh/PaliC/28/next -> origin/gh/PaliC/28/next 2025-12-04T11:11:09.6011026Z * [new branch] gh/PaliC/28/orig -> origin/gh/PaliC/28/orig 2025-12-04T11:11:09.6011203Z * [new branch] gh/PaliC/29/head -> origin/gh/PaliC/29/head 2025-12-04T11:11:09.6011383Z * [new branch] gh/PaliC/29/next -> origin/gh/PaliC/29/next 2025-12-04T11:11:09.6011557Z * [new branch] gh/PaliC/29/orig -> origin/gh/PaliC/29/orig 2025-12-04T11:11:09.6011735Z * [new branch] gh/PaliC/30/head -> origin/gh/PaliC/30/head 2025-12-04T11:11:09.6011953Z * [new branch] gh/PaliC/30/next -> origin/gh/PaliC/30/next 2025-12-04T11:11:09.6012128Z * [new branch] gh/PaliC/30/orig -> origin/gh/PaliC/30/orig 2025-12-04T11:11:09.6012339Z * [new branch] gh/PaliC/31/head -> origin/gh/PaliC/31/head 2025-12-04T11:11:09.6012513Z * [new branch] gh/PaliC/31/next -> origin/gh/PaliC/31/next 2025-12-04T11:11:09.6012694Z * [new branch] gh/PaliC/31/orig -> origin/gh/PaliC/31/orig 2025-12-04T11:11:09.6012885Z * [new branch] gh/PaulZhang12/25/base -> origin/gh/PaulZhang12/25/base 2025-12-04T11:11:09.6013080Z * [new branch] gh/PaulZhang12/25/head 
-> origin/gh/PaulZhang12/25/head 2025-12-04T11:11:09.6013279Z * [new branch] gh/PaulZhang12/25/orig -> origin/gh/PaulZhang12/25/orig 2025-12-04T11:11:09.6013478Z * [new branch] gh/PaulZhang12/28/base -> origin/gh/PaulZhang12/28/base 2025-12-04T11:11:09.6013674Z * [new branch] gh/PaulZhang12/28/head -> origin/gh/PaulZhang12/28/head 2025-12-04T11:11:09.6013873Z * [new branch] gh/PaulZhang12/28/orig -> origin/gh/PaulZhang12/28/orig 2025-12-04T11:11:09.6014075Z * [new branch] gh/PaulZhang12/31/base -> origin/gh/PaulZhang12/31/base 2025-12-04T11:11:09.6014266Z * [new branch] gh/PaulZhang12/31/head -> origin/gh/PaulZhang12/31/head 2025-12-04T11:11:09.6014463Z * [new branch] gh/PaulZhang12/31/orig -> origin/gh/PaulZhang12/31/orig 2025-12-04T11:11:09.6014660Z * [new branch] gh/PaulZhang12/37/base -> origin/gh/PaulZhang12/37/base 2025-12-04T11:11:09.6014852Z * [new branch] gh/PaulZhang12/37/head -> origin/gh/PaulZhang12/37/head 2025-12-04T11:11:09.6015047Z * [new branch] gh/PaulZhang12/37/orig -> origin/gh/PaulZhang12/37/orig 2025-12-04T11:11:09.6015243Z * [new branch] gh/PaulZhang12/40/base -> origin/gh/PaulZhang12/40/base 2025-12-04T11:11:09.6015440Z * [new branch] gh/PaulZhang12/40/head -> origin/gh/PaulZhang12/40/head 2025-12-04T11:11:09.6015636Z * [new branch] gh/PaulZhang12/40/orig -> origin/gh/PaulZhang12/40/orig 2025-12-04T11:11:09.6015835Z * [new branch] gh/PaulZhang12/42/base -> origin/gh/PaulZhang12/42/base 2025-12-04T11:11:09.6016028Z * [new branch] gh/PaulZhang12/42/head -> origin/gh/PaulZhang12/42/head 2025-12-04T11:11:09.6016227Z * [new branch] gh/PaulZhang12/43/base -> origin/gh/PaulZhang12/43/base 2025-12-04T11:11:09.6016425Z * [new branch] gh/PaulZhang12/43/head -> origin/gh/PaulZhang12/43/head 2025-12-04T11:11:09.6016617Z * [new branch] gh/PaulZhang12/43/orig -> origin/gh/PaulZhang12/43/orig 2025-12-04T11:11:09.6016813Z * [new branch] gh/PaulZhang12/44/base -> origin/gh/PaulZhang12/44/base 2025-12-04T11:11:09.6017012Z * [new branch] gh/PaulZhang12/44/head -> origin/gh/PaulZhang12/44/head 2025-12-04T11:11:09.6017204Z * [new branch] gh/PaulZhang12/45/base -> origin/gh/PaulZhang12/45/base 2025-12-04T11:11:09.6017400Z * [new branch] gh/PaulZhang12/45/head -> origin/gh/PaulZhang12/45/head 2025-12-04T11:11:09.6017597Z * [new branch] gh/PaulZhang12/45/orig -> origin/gh/PaulZhang12/45/orig 2025-12-04T11:11:09.6017788Z * [new branch] gh/PaulZhang12/46/base -> origin/gh/PaulZhang12/46/base 2025-12-04T11:11:09.6017981Z * [new branch] gh/PaulZhang12/46/head -> origin/gh/PaulZhang12/46/head 2025-12-04T11:11:09.6018313Z * [new branch] gh/PaulZhang12/46/orig -> origin/gh/PaulZhang12/46/orig 2025-12-04T11:11:09.6018503Z * [new branch] gh/PaulZhang12/47/base -> origin/gh/PaulZhang12/47/base 2025-12-04T11:11:09.6018696Z * [new branch] gh/PaulZhang12/47/head -> origin/gh/PaulZhang12/47/head 2025-12-04T11:11:09.6018926Z * [new branch] gh/PaulZhang12/47/orig -> origin/gh/PaulZhang12/47/orig 2025-12-04T11:11:09.6019117Z * [new branch] gh/PaulZhang12/48/base -> origin/gh/PaulZhang12/48/base 2025-12-04T11:11:09.6019313Z * [new branch] gh/PaulZhang12/48/head -> origin/gh/PaulZhang12/48/head 2025-12-04T11:11:09.6019544Z * [new branch] gh/PaulZhang12/48/orig -> origin/gh/PaulZhang12/48/orig 2025-12-04T11:11:09.6019732Z * [new branch] gh/SamGinzburg/11/base -> origin/gh/SamGinzburg/11/base 2025-12-04T11:11:09.6019926Z * [new branch] gh/SamGinzburg/11/head -> origin/gh/SamGinzburg/11/head 2025-12-04T11:11:09.6020128Z * [new branch] gh/SherlockNoMad/1/base -> origin/gh/SherlockNoMad/1/base 
2025-12-04T11:11:09.6020327Z * [new branch] gh/SherlockNoMad/1/head -> origin/gh/SherlockNoMad/1/head 2025-12-04T11:11:09.6020531Z * [new branch] gh/SherlockNoMad/10/base -> origin/gh/SherlockNoMad/10/base 2025-12-04T11:11:09.6020742Z * [new branch] gh/SherlockNoMad/10/head -> origin/gh/SherlockNoMad/10/head 2025-12-04T11:11:09.6020946Z * [new branch] gh/SherlockNoMad/10/orig -> origin/gh/SherlockNoMad/10/orig 2025-12-04T11:11:09.6021154Z * [new branch] gh/SherlockNoMad/11/base -> origin/gh/SherlockNoMad/11/base 2025-12-04T11:11:09.6021356Z * [new branch] gh/SherlockNoMad/11/head -> origin/gh/SherlockNoMad/11/head 2025-12-04T11:11:09.6021556Z * [new branch] gh/SherlockNoMad/11/orig -> origin/gh/SherlockNoMad/11/orig 2025-12-04T11:11:09.6021758Z * [new branch] gh/SherlockNoMad/12/base -> origin/gh/SherlockNoMad/12/base 2025-12-04T11:11:09.6021961Z * [new branch] gh/SherlockNoMad/12/head -> origin/gh/SherlockNoMad/12/head 2025-12-04T11:11:09.6022159Z * [new branch] gh/SherlockNoMad/12/orig -> origin/gh/SherlockNoMad/12/orig 2025-12-04T11:11:09.6022364Z * [new branch] gh/SherlockNoMad/15/base -> origin/gh/SherlockNoMad/15/base 2025-12-04T11:11:09.6022565Z * [new branch] gh/SherlockNoMad/15/head -> origin/gh/SherlockNoMad/15/head 2025-12-04T11:11:09.6022762Z * [new branch] gh/SherlockNoMad/15/orig -> origin/gh/SherlockNoMad/15/orig 2025-12-04T11:11:09.6022966Z * [new branch] gh/SherlockNoMad/17/base -> origin/gh/SherlockNoMad/17/base 2025-12-04T11:11:09.6023163Z * [new branch] gh/SherlockNoMad/17/head -> origin/gh/SherlockNoMad/17/head 2025-12-04T11:11:09.6023366Z * [new branch] gh/SherlockNoMad/17/orig -> origin/gh/SherlockNoMad/17/orig 2025-12-04T11:11:09.6023569Z * [new branch] gh/SherlockNoMad/18/base -> origin/gh/SherlockNoMad/18/base 2025-12-04T11:11:09.6023767Z * [new branch] gh/SherlockNoMad/18/head -> origin/gh/SherlockNoMad/18/head 2025-12-04T11:11:09.6023971Z * [new branch] gh/SherlockNoMad/18/orig -> origin/gh/SherlockNoMad/18/orig 2025-12-04T11:11:09.6024172Z * [new branch] gh/SherlockNoMad/19/base -> origin/gh/SherlockNoMad/19/base 2025-12-04T11:11:09.6024372Z * [new branch] gh/SherlockNoMad/19/head -> origin/gh/SherlockNoMad/19/head 2025-12-04T11:11:09.6024580Z * [new branch] gh/SherlockNoMad/19/orig -> origin/gh/SherlockNoMad/19/orig 2025-12-04T11:11:09.6024780Z * [new branch] gh/SherlockNoMad/2/base -> origin/gh/SherlockNoMad/2/base 2025-12-04T11:11:09.6024977Z * [new branch] gh/SherlockNoMad/2/head -> origin/gh/SherlockNoMad/2/head 2025-12-04T11:11:09.6025180Z * [new branch] gh/SherlockNoMad/20/base -> origin/gh/SherlockNoMad/20/base 2025-12-04T11:11:09.6025383Z * [new branch] gh/SherlockNoMad/20/head -> origin/gh/SherlockNoMad/20/head 2025-12-04T11:11:09.6025580Z * [new branch] gh/SherlockNoMad/20/orig -> origin/gh/SherlockNoMad/20/orig 2025-12-04T11:11:09.6025812Z * [new branch] gh/SherlockNoMad/21/base -> origin/gh/SherlockNoMad/21/base 2025-12-04T11:11:09.6026015Z * [new branch] gh/SherlockNoMad/21/head -> origin/gh/SherlockNoMad/21/head 2025-12-04T11:11:09.6026214Z * [new branch] gh/SherlockNoMad/21/orig -> origin/gh/SherlockNoMad/21/orig 2025-12-04T11:11:09.6026439Z * [new branch] gh/SherlockNoMad/3/base -> origin/gh/SherlockNoMad/3/base 2025-12-04T11:11:09.6026639Z * [new branch] gh/SherlockNoMad/3/head -> origin/gh/SherlockNoMad/3/head 2025-12-04T11:11:09.6026832Z * [new branch] gh/SherlockNoMad/4/base -> origin/gh/SherlockNoMad/4/base 2025-12-04T11:11:09.6027028Z * [new branch] gh/SherlockNoMad/4/head -> origin/gh/SherlockNoMad/4/head 2025-12-04T11:11:09.6027227Z * 
[new branch] gh/SherlockNoMad/5/base -> origin/gh/SherlockNoMad/5/base 2025-12-04T11:11:09.6027421Z * [new branch] gh/SherlockNoMad/5/head -> origin/gh/SherlockNoMad/5/head 2025-12-04T11:11:09.6027639Z * [new branch] gh/Sidharth123-cpu/24/base -> origin/gh/Sidharth123-cpu/24/base 2025-12-04T11:11:09.6027859Z * [new branch] gh/Sidharth123-cpu/25/base -> origin/gh/Sidharth123-cpu/25/base 2025-12-04T11:11:09.6028072Z * [new branch] gh/Sidharth123-cpu/26/base -> origin/gh/Sidharth123-cpu/26/base 2025-12-04T11:11:09.6028326Z * [new branch] gh/Sidharth123-cpu/27/base -> origin/gh/Sidharth123-cpu/27/base 2025-12-04T11:11:09.6028530Z * [new branch] gh/StrongerXi/1/base -> origin/gh/StrongerXi/1/base 2025-12-04T11:11:09.6028721Z * [new branch] gh/StrongerXi/1/head -> origin/gh/StrongerXi/1/head 2025-12-04T11:11:09.6028915Z * [new branch] gh/StrongerXi/71/base -> origin/gh/StrongerXi/71/base 2025-12-04T11:11:09.6029110Z * [new branch] gh/StrongerXi/71/head -> origin/gh/StrongerXi/71/head 2025-12-04T11:11:09.6029302Z * [new branch] gh/StrongerXi/72/base -> origin/gh/StrongerXi/72/base 2025-12-04T11:11:09.6029497Z * [new branch] gh/StrongerXi/72/head -> origin/gh/StrongerXi/72/head 2025-12-04T11:11:09.6029684Z * [new branch] gh/StrongerXi/73/base -> origin/gh/StrongerXi/73/base 2025-12-04T11:11:09.6029880Z * [new branch] gh/StrongerXi/73/head -> origin/gh/StrongerXi/73/head 2025-12-04T11:11:09.6030069Z * [new branch] gh/StrongerXi/73/orig -> origin/gh/StrongerXi/73/orig 2025-12-04T11:11:09.6030256Z * [new branch] gh/XilunWu/160/base -> origin/gh/XilunWu/160/base 2025-12-04T11:11:09.6030444Z * [new branch] gh/XilunWu/160/head -> origin/gh/XilunWu/160/head 2025-12-04T11:11:09.6030629Z * [new branch] gh/XilunWu/160/orig -> origin/gh/XilunWu/160/orig 2025-12-04T11:11:09.6030810Z * [new branch] gh/XilunWu/163/base -> origin/gh/XilunWu/163/base 2025-12-04T11:11:09.6030994Z * [new branch] gh/XilunWu/163/head -> origin/gh/XilunWu/163/head 2025-12-04T11:11:09.6031178Z * [new branch] gh/XilunWu/163/orig -> origin/gh/XilunWu/163/orig 2025-12-04T11:11:09.6031358Z * [new branch] gh/XilunWu/168/base -> origin/gh/XilunWu/168/base 2025-12-04T11:11:09.6031541Z * [new branch] gh/XilunWu/168/head -> origin/gh/XilunWu/168/head 2025-12-04T11:11:09.6031725Z * [new branch] gh/XilunWu/168/orig -> origin/gh/XilunWu/168/orig 2025-12-04T11:11:09.6031907Z * [new branch] gh/XilunWu/169/base -> origin/gh/XilunWu/169/base 2025-12-04T11:11:09.6032091Z * [new branch] gh/XilunWu/169/head -> origin/gh/XilunWu/169/head 2025-12-04T11:11:09.6032277Z * [new branch] gh/XilunWu/169/orig -> origin/gh/XilunWu/169/orig 2025-12-04T11:11:09.6032457Z * [new branch] gh/XilunWu/170/base -> origin/gh/XilunWu/170/base 2025-12-04T11:11:09.6032638Z * [new branch] gh/XilunWu/170/head -> origin/gh/XilunWu/170/head 2025-12-04T11:11:09.6032857Z * [new branch] gh/XilunWu/170/orig -> origin/gh/XilunWu/170/orig 2025-12-04T11:11:09.6033043Z * [new branch] gh/XilunWu/171/base -> origin/gh/XilunWu/171/base 2025-12-04T11:11:09.6033254Z * [new branch] gh/XilunWu/171/head -> origin/gh/XilunWu/171/head 2025-12-04T11:11:09.6033438Z * [new branch] gh/XilunWu/171/orig -> origin/gh/XilunWu/171/orig 2025-12-04T11:11:09.6033621Z * [new branch] gh/XilunWu/173/base -> origin/gh/XilunWu/173/base 2025-12-04T11:11:09.6033802Z * [new branch] gh/XilunWu/173/head -> origin/gh/XilunWu/173/head 2025-12-04T11:11:09.6033981Z * [new branch] gh/XilunWu/173/orig -> origin/gh/XilunWu/173/orig 2025-12-04T11:11:09.6034165Z * [new branch] gh/XilunWu/175/base -> origin/gh/XilunWu/175/base 
2025-12-04T11:11:09.6034349Z * [new branch] gh/XilunWu/175/head -> origin/gh/XilunWu/175/head 2025-12-04T11:11:09.6034534Z * [new branch] gh/XilunWu/175/orig -> origin/gh/XilunWu/175/orig 2025-12-04T11:11:09.6034716Z * [new branch] gh/XilunWu/176/base -> origin/gh/XilunWu/176/base 2025-12-04T11:11:09.6034900Z * [new branch] gh/XilunWu/176/head -> origin/gh/XilunWu/176/head 2025-12-04T11:11:09.6035080Z * [new branch] gh/XilunWu/176/orig -> origin/gh/XilunWu/176/orig 2025-12-04T11:11:09.6035267Z * [new branch] gh/XuehaiPan/14/base -> origin/gh/XuehaiPan/14/base 2025-12-04T11:11:09.6035459Z * [new branch] gh/XuehaiPan/14/head -> origin/gh/XuehaiPan/14/head 2025-12-04T11:11:09.6035645Z * [new branch] gh/XuehaiPan/14/orig -> origin/gh/XuehaiPan/14/orig 2025-12-04T11:11:09.6035838Z * [new branch] gh/XuehaiPan/179/base -> origin/gh/XuehaiPan/179/base 2025-12-04T11:11:09.6036035Z * [new branch] gh/XuehaiPan/179/head -> origin/gh/XuehaiPan/179/head 2025-12-04T11:11:09.6036221Z * [new branch] gh/XuehaiPan/179/orig -> origin/gh/XuehaiPan/179/orig 2025-12-04T11:11:09.6036411Z * [new branch] gh/XuehaiPan/249/base -> origin/gh/XuehaiPan/249/base 2025-12-04T11:11:09.6036608Z * [new branch] gh/XuehaiPan/249/head -> origin/gh/XuehaiPan/249/head 2025-12-04T11:11:09.6036793Z * [new branch] gh/XuehaiPan/249/orig -> origin/gh/XuehaiPan/249/orig 2025-12-04T11:11:09.6036984Z * [new branch] gh/XuehaiPan/253/base -> origin/gh/XuehaiPan/253/base 2025-12-04T11:11:09.6037175Z * [new branch] gh/XuehaiPan/253/head -> origin/gh/XuehaiPan/253/head 2025-12-04T11:11:09.6037365Z * [new branch] gh/XuehaiPan/253/orig -> origin/gh/XuehaiPan/253/orig 2025-12-04T11:11:09.6037553Z * [new branch] gh/XuehaiPan/254/base -> origin/gh/XuehaiPan/254/base 2025-12-04T11:11:09.6037743Z * [new branch] gh/XuehaiPan/254/head -> origin/gh/XuehaiPan/254/head 2025-12-04T11:11:09.6037931Z * [new branch] gh/XuehaiPan/254/orig -> origin/gh/XuehaiPan/254/orig 2025-12-04T11:11:09.6038120Z * [new branch] gh/XuehaiPan/255/base -> origin/gh/XuehaiPan/255/base 2025-12-04T11:11:09.6038351Z * [new branch] gh/XuehaiPan/255/head -> origin/gh/XuehaiPan/255/head 2025-12-04T11:11:09.6038538Z * [new branch] gh/XuehaiPan/255/orig -> origin/gh/XuehaiPan/255/orig 2025-12-04T11:11:09.6038726Z * [new branch] gh/XuehaiPan/271/base -> origin/gh/XuehaiPan/271/base 2025-12-04T11:11:09.6038912Z * [new branch] gh/XuehaiPan/271/head -> origin/gh/XuehaiPan/271/head 2025-12-04T11:11:09.6039104Z * [new branch] gh/XuehaiPan/271/orig -> origin/gh/XuehaiPan/271/orig 2025-12-04T11:11:09.6039297Z * [new branch] gh/XuehaiPan/343/base -> origin/gh/XuehaiPan/343/base 2025-12-04T11:11:09.6039517Z * [new branch] gh/XuehaiPan/343/head -> origin/gh/XuehaiPan/343/head 2025-12-04T11:11:09.6039707Z * [new branch] gh/XuehaiPan/343/orig -> origin/gh/XuehaiPan/343/orig 2025-12-04T11:11:09.6039897Z * [new branch] gh/XuehaiPan/347/base -> origin/gh/XuehaiPan/347/base 2025-12-04T11:11:09.6040127Z * [new branch] gh/XuehaiPan/347/head -> origin/gh/XuehaiPan/347/head 2025-12-04T11:11:09.6040318Z * [new branch] gh/XuehaiPan/347/orig -> origin/gh/XuehaiPan/347/orig 2025-12-04T11:11:09.6040507Z * [new branch] gh/XuehaiPan/348/base -> origin/gh/XuehaiPan/348/base 2025-12-04T11:11:09.6040694Z * [new branch] gh/XuehaiPan/348/head -> origin/gh/XuehaiPan/348/head 2025-12-04T11:11:09.6040883Z * [new branch] gh/XuehaiPan/348/orig -> origin/gh/XuehaiPan/348/orig 2025-12-04T11:11:09.6041071Z * [new branch] gh/XuehaiPan/350/base -> origin/gh/XuehaiPan/350/base 2025-12-04T11:11:09.6041259Z * [new branch] 
gh/XuehaiPan/350/head -> origin/gh/XuehaiPan/350/head 2025-12-04T11:11:09.6041450Z * [new branch] gh/XuehaiPan/350/orig -> origin/gh/XuehaiPan/350/orig 2025-12-04T11:11:09.6041638Z * [new branch] gh/XuehaiPan/365/base -> origin/gh/XuehaiPan/365/base 2025-12-04T11:11:09.6041827Z * [new branch] gh/XuehaiPan/365/head -> origin/gh/XuehaiPan/365/head 2025-12-04T11:11:09.6042015Z * [new branch] gh/XuehaiPan/365/orig -> origin/gh/XuehaiPan/365/orig 2025-12-04T11:11:09.6042202Z * [new branch] gh/XuehaiPan/366/base -> origin/gh/XuehaiPan/366/base 2025-12-04T11:11:09.6042391Z * [new branch] gh/XuehaiPan/366/head -> origin/gh/XuehaiPan/366/head 2025-12-04T11:11:09.6042581Z * [new branch] gh/XuehaiPan/370/base -> origin/gh/XuehaiPan/370/base 2025-12-04T11:11:09.6042768Z * [new branch] gh/XuehaiPan/370/head -> origin/gh/XuehaiPan/370/head 2025-12-04T11:11:09.6042960Z * [new branch] gh/XuehaiPan/370/orig -> origin/gh/XuehaiPan/370/orig 2025-12-04T11:11:09.6043151Z * [new branch] gh/XuehaiPan/390/base -> origin/gh/XuehaiPan/390/base 2025-12-04T11:11:09.6043343Z * [new branch] gh/XuehaiPan/390/head -> origin/gh/XuehaiPan/390/head 2025-12-04T11:11:09.6043533Z * [new branch] gh/XuehaiPan/390/orig -> origin/gh/XuehaiPan/390/orig 2025-12-04T11:11:09.6043722Z * [new branch] gh/XuehaiPan/391/base -> origin/gh/XuehaiPan/391/base 2025-12-04T11:11:09.6043908Z * [new branch] gh/XuehaiPan/391/head -> origin/gh/XuehaiPan/391/head 2025-12-04T11:11:09.6044096Z * [new branch] gh/XuehaiPan/391/orig -> origin/gh/XuehaiPan/391/orig 2025-12-04T11:11:09.6044286Z * [new branch] gh/XuehaiPan/392/base -> origin/gh/XuehaiPan/392/base 2025-12-04T11:11:09.6044471Z * [new branch] gh/XuehaiPan/392/head -> origin/gh/XuehaiPan/392/head 2025-12-04T11:11:09.6044666Z * [new branch] gh/XuehaiPan/392/orig -> origin/gh/XuehaiPan/392/orig 2025-12-04T11:11:09.6044857Z * [new branch] gh/XuehaiPan/394/base -> origin/gh/XuehaiPan/394/base 2025-12-04T11:11:09.6045046Z * [new branch] gh/XuehaiPan/394/head -> origin/gh/XuehaiPan/394/head 2025-12-04T11:11:09.6045234Z * [new branch] gh/XuehaiPan/394/orig -> origin/gh/XuehaiPan/394/orig 2025-12-04T11:11:09.6045425Z * [new branch] gh/XuehaiPan/397/base -> origin/gh/XuehaiPan/397/base 2025-12-04T11:11:09.6045611Z * [new branch] gh/XuehaiPan/397/head -> origin/gh/XuehaiPan/397/head 2025-12-04T11:11:09.6045807Z * [new branch] gh/XuehaiPan/397/orig -> origin/gh/XuehaiPan/397/orig 2025-12-04T11:11:09.6045998Z * [new branch] gh/XuehaiPan/398/base -> origin/gh/XuehaiPan/398/base 2025-12-04T11:11:09.6046213Z * [new branch] gh/XuehaiPan/398/head -> origin/gh/XuehaiPan/398/head 2025-12-04T11:11:09.6046405Z * [new branch] gh/XuehaiPan/398/orig -> origin/gh/XuehaiPan/398/orig 2025-12-04T11:11:09.6046591Z * [new branch] gh/XuehaiPan/399/base -> origin/gh/XuehaiPan/399/base 2025-12-04T11:11:09.6046812Z * [new branch] gh/XuehaiPan/399/head -> origin/gh/XuehaiPan/399/head 2025-12-04T11:11:09.6047001Z * [new branch] gh/XuehaiPan/399/orig -> origin/gh/XuehaiPan/399/orig 2025-12-04T11:11:09.6047188Z * [new branch] gh/XuehaiPan/400/base -> origin/gh/XuehaiPan/400/base 2025-12-04T11:11:09.6047376Z * [new branch] gh/XuehaiPan/400/head -> origin/gh/XuehaiPan/400/head 2025-12-04T11:11:09.6047566Z * [new branch] gh/XuehaiPan/400/orig -> origin/gh/XuehaiPan/400/orig 2025-12-04T11:11:09.6047762Z * [new branch] gh/ZhiweiYan-96/39/base -> origin/gh/ZhiweiYan-96/39/base 2025-12-04T11:11:09.6047964Z * [new branch] gh/ZhiweiYan-96/39/head -> origin/gh/ZhiweiYan-96/39/head 2025-12-04T11:11:09.6048201Z * [new branch] 
gh/ZhiweiYan-96/39/orig -> origin/gh/ZhiweiYan-96/39/orig 2025-12-04T11:11:09.6048392Z * [new branch] gh/ZhiweiYan-96/44/base -> origin/gh/ZhiweiYan-96/44/base 2025-12-04T11:11:09.6048591Z * [new branch] gh/ZhiweiYan-96/44/head -> origin/gh/ZhiweiYan-96/44/head 2025-12-04T11:11:09.6048785Z * [new branch] gh/ZhiweiYan-96/45/base -> origin/gh/ZhiweiYan-96/45/base 2025-12-04T11:11:09.6048975Z * [new branch] gh/ZhiweiYan-96/45/head -> origin/gh/ZhiweiYan-96/45/head 2025-12-04T11:11:09.6049166Z * [new branch] gh/ZhiweiYan-96/49/base -> origin/gh/ZhiweiYan-96/49/base 2025-12-04T11:11:09.6049358Z * [new branch] gh/ZhiweiYan-96/49/head -> origin/gh/ZhiweiYan-96/49/head 2025-12-04T11:11:09.6049547Z * [new branch] gh/ZhiweiYan-96/62/base -> origin/gh/ZhiweiYan-96/62/base 2025-12-04T11:11:09.6049741Z * [new branch] gh/ZhiweiYan-96/62/head -> origin/gh/ZhiweiYan-96/62/head 2025-12-04T11:11:09.6049936Z * [new branch] gh/ZhiweiYan-96/66/base -> origin/gh/ZhiweiYan-96/66/base 2025-12-04T11:11:09.6050128Z * [new branch] gh/ZhiweiYan-96/66/head -> origin/gh/ZhiweiYan-96/66/head 2025-12-04T11:11:09.6050321Z * [new branch] gh/ZhiweiYan-96/67/base -> origin/gh/ZhiweiYan-96/67/base 2025-12-04T11:11:09.6050514Z * [new branch] gh/ZhiweiYan-96/67/head -> origin/gh/ZhiweiYan-96/67/head 2025-12-04T11:11:09.6050705Z * [new branch] gh/ZhiweiYan-96/68/base -> origin/gh/ZhiweiYan-96/68/base 2025-12-04T11:11:09.6050898Z * [new branch] gh/ZhiweiYan-96/68/head -> origin/gh/ZhiweiYan-96/68/head 2025-12-04T11:11:09.6051087Z * [new branch] gh/ZhiweiYan-96/68/orig -> origin/gh/ZhiweiYan-96/68/orig 2025-12-04T11:11:09.6051278Z * [new branch] gh/aakhundov/1/base -> origin/gh/aakhundov/1/base 2025-12-04T11:11:09.6051469Z * [new branch] gh/aakhundov/1/head -> origin/gh/aakhundov/1/head 2025-12-04T11:11:09.6051652Z * [new branch] gh/aakhundov/2/base -> origin/gh/aakhundov/2/base 2025-12-04T11:11:09.6051840Z * [new branch] gh/aakhundov/2/head -> origin/gh/aakhundov/2/head 2025-12-04T11:11:09.6052029Z * [new branch] gh/aditew01/openblas -> origin/gh/aditew01/openblas 2025-12-04T11:11:09.6052217Z * [new branch] gh/aditew01/sbgemm -> origin/gh/aditew01/sbgemm 2025-12-04T11:11:09.6052404Z * [new branch] gh/aditew01/vecbf16 -> origin/gh/aditew01/vecbf16 2025-12-04T11:11:09.6052585Z * [new branch] gh/albanD/4/base -> origin/gh/albanD/4/base 2025-12-04T11:11:09.6052764Z * [new branch] gh/albanD/4/head -> origin/gh/albanD/4/head 2025-12-04T11:11:09.6052944Z * [new branch] gh/albanD/4/orig -> origin/gh/albanD/4/orig 2025-12-04T11:11:09.6053249Z * [new branch] gh/alexbrauckmann/paddedtensor_faketensor_init -> origin/gh/alexbrauckmann/paddedtensor_faketensor_init 2025-12-04T11:11:09.6053526Z * [new branch] gh/alexsamardzic/12/base -> origin/gh/alexsamardzic/12/base 2025-12-04T11:11:09.6053770Z * [new branch] gh/alexsamardzic/12/head -> origin/gh/alexsamardzic/12/head 2025-12-04T11:11:09.6053981Z * [new branch] gh/alexsamardzic/12/orig -> origin/gh/alexsamardzic/12/orig 2025-12-04T11:11:09.6054183Z * [new branch] gh/alexsamardzic/14/base -> origin/gh/alexsamardzic/14/base 2025-12-04T11:11:09.6054388Z * [new branch] gh/alexsamardzic/14/head -> origin/gh/alexsamardzic/14/head 2025-12-04T11:11:09.6054595Z * [new branch] gh/alexsamardzic/14/orig -> origin/gh/alexsamardzic/14/orig 2025-12-04T11:11:09.6054797Z * [new branch] gh/alexsamardzic/15/base -> origin/gh/alexsamardzic/15/base 2025-12-04T11:11:09.6055004Z * [new branch] gh/alexsamardzic/15/head -> origin/gh/alexsamardzic/15/head 2025-12-04T11:11:09.6055210Z * [new branch] 
gh/alexsamardzic/15/orig -> origin/gh/alexsamardzic/15/orig 2025-12-04T11:11:09.6055410Z * [new branch] gh/amjames/18/base -> origin/gh/amjames/18/base 2025-12-04T11:11:09.6055600Z * [new branch] gh/amjames/18/head -> origin/gh/amjames/18/head 2025-12-04T11:11:09.6055786Z * [new branch] gh/amjames/18/orig -> origin/gh/amjames/18/orig 2025-12-04T11:11:09.6055976Z * [new branch] gh/andrewor14/35/base -> origin/gh/andrewor14/35/base 2025-12-04T11:11:09.6056173Z * [new branch] gh/andrewor14/35/head -> origin/gh/andrewor14/35/head 2025-12-04T11:11:09.6056362Z * [new branch] gh/andrewor14/35/orig -> origin/gh/andrewor14/35/orig 2025-12-04T11:11:09.6056555Z * [new branch] gh/andrewor14/50/base -> origin/gh/andrewor14/50/base 2025-12-04T11:11:09.6056749Z * [new branch] gh/andrewor14/50/head -> origin/gh/andrewor14/50/head 2025-12-04T11:11:09.6056938Z * [new branch] gh/andrewor14/50/orig -> origin/gh/andrewor14/50/orig 2025-12-04T11:11:09.6057136Z * [new branch] gh/andyanwang/30/base -> origin/gh/andyanwang/30/base 2025-12-04T11:11:09.6057330Z * [new branch] gh/andyanwang/30/orig -> origin/gh/andyanwang/30/orig 2025-12-04T11:11:09.6057519Z * [new branch] gh/andyanwang/31/base -> origin/gh/andyanwang/31/base 2025-12-04T11:11:09.6057717Z * [new branch] gh/andyanwang/31/orig -> origin/gh/andyanwang/31/orig 2025-12-04T11:11:09.6057911Z * [new branch] gh/andyanwang/39/base -> origin/gh/andyanwang/39/base 2025-12-04T11:11:09.6058101Z * [new branch] gh/andyanwang/39/head -> origin/gh/andyanwang/39/head 2025-12-04T11:11:09.6058331Z * [new branch] gh/andyanwang/39/orig -> origin/gh/andyanwang/39/orig 2025-12-04T11:11:09.6058527Z * [new branch] gh/andyanwang/42/base -> origin/gh/andyanwang/42/base 2025-12-04T11:11:09.6058717Z * [new branch] gh/andyanwang/42/head -> origin/gh/andyanwang/42/head 2025-12-04T11:11:09.6058916Z * [new branch] gh/andyanwang/42/orig -> origin/gh/andyanwang/42/orig 2025-12-04T11:11:09.6059110Z * [new branch] gh/andyanwang/45/base -> origin/gh/andyanwang/45/base 2025-12-04T11:11:09.6059299Z * [new branch] gh/andyanwang/45/head -> origin/gh/andyanwang/45/head 2025-12-04T11:11:09.6059493Z * [new branch] gh/andyanwang/45/orig -> origin/gh/andyanwang/45/orig 2025-12-04T11:11:09.6059686Z * [new branch] gh/angelayi/107/base -> origin/gh/angelayi/107/base 2025-12-04T11:11:09.6059874Z * [new branch] gh/angelayi/107/head -> origin/gh/angelayi/107/head 2025-12-04T11:11:09.6060101Z * [new branch] gh/angelayi/114/base -> origin/gh/angelayi/114/base 2025-12-04T11:11:09.6060294Z * [new branch] gh/angelayi/114/head -> origin/gh/angelayi/114/head 2025-12-04T11:11:09.6060480Z * [new branch] gh/angelayi/114/orig -> origin/gh/angelayi/114/orig 2025-12-04T11:11:09.6060701Z * [new branch] gh/angelayi/116/base -> origin/gh/angelayi/116/base 2025-12-04T11:11:09.6060891Z * [new branch] gh/angelayi/116/head -> origin/gh/angelayi/116/head 2025-12-04T11:11:09.6061075Z * [new branch] gh/angelayi/116/orig -> origin/gh/angelayi/116/orig 2025-12-04T11:11:09.6061268Z * [new branch] gh/angelayi/122/base -> origin/gh/angelayi/122/base 2025-12-04T11:11:09.6061454Z * [new branch] gh/angelayi/122/head -> origin/gh/angelayi/122/head 2025-12-04T11:11:09.6061647Z * [new branch] gh/angelayi/122/orig -> origin/gh/angelayi/122/orig 2025-12-04T11:11:09.6061842Z * [new branch] gh/angelayi/124/base -> origin/gh/angelayi/124/base 2025-12-04T11:11:09.6062030Z * [new branch] gh/angelayi/124/head -> origin/gh/angelayi/124/head 2025-12-04T11:11:09.6062223Z * [new branch] gh/angelayi/124/orig -> origin/gh/angelayi/124/orig 
2025-12-04T11:11:09.6062423Z * [new branch] gh/angelayi/128/base -> origin/gh/angelayi/128/base 2025-12-04T11:11:09.6062609Z * [new branch] gh/angelayi/128/head -> origin/gh/angelayi/128/head 2025-12-04T11:11:09.6062800Z * [new branch] gh/angelayi/128/orig -> origin/gh/angelayi/128/orig 2025-12-04T11:11:09.6062992Z * [new branch] gh/angelayi/131/base -> origin/gh/angelayi/131/base 2025-12-04T11:11:09.6063179Z * [new branch] gh/angelayi/131/head -> origin/gh/angelayi/131/head 2025-12-04T11:11:09.6063371Z * [new branch] gh/angelayi/131/orig -> origin/gh/angelayi/131/orig 2025-12-04T11:11:09.6063566Z * [new branch] gh/angelayi/132/base -> origin/gh/angelayi/132/base 2025-12-04T11:11:09.6063753Z * [new branch] gh/angelayi/132/head -> origin/gh/angelayi/132/head 2025-12-04T11:11:09.6063946Z * [new branch] gh/angelayi/132/orig -> origin/gh/angelayi/132/orig 2025-12-04T11:11:09.6064147Z * [new branch] gh/angelayi/133/base -> origin/gh/angelayi/133/base 2025-12-04T11:11:09.6064335Z * [new branch] gh/angelayi/133/head -> origin/gh/angelayi/133/head 2025-12-04T11:11:09.6064526Z * [new branch] gh/angelayi/133/orig -> origin/gh/angelayi/133/orig 2025-12-04T11:11:09.6064719Z * [new branch] gh/angelayi/134/base -> origin/gh/angelayi/134/base 2025-12-04T11:11:09.6064906Z * [new branch] gh/angelayi/134/head -> origin/gh/angelayi/134/head 2025-12-04T11:11:09.6065099Z * [new branch] gh/angelayi/134/orig -> origin/gh/angelayi/134/orig 2025-12-04T11:11:09.6065296Z * [new branch] gh/angelayi/135/base -> origin/gh/angelayi/135/base 2025-12-04T11:11:09.6065484Z * [new branch] gh/angelayi/135/head -> origin/gh/angelayi/135/head 2025-12-04T11:11:09.6065675Z * [new branch] gh/angelayi/135/orig -> origin/gh/angelayi/135/orig 2025-12-04T11:11:09.6065864Z * [new branch] gh/angelayi/136/base -> origin/gh/angelayi/136/base 2025-12-04T11:11:09.6066057Z * [new branch] gh/angelayi/136/head -> origin/gh/angelayi/136/head 2025-12-04T11:11:09.6066250Z * [new branch] gh/angelayi/136/orig -> origin/gh/angelayi/136/orig 2025-12-04T11:11:09.6066437Z * [new branch] gh/angelayi/137/base -> origin/gh/angelayi/137/base 2025-12-04T11:11:09.6066630Z * [new branch] gh/angelayi/137/head -> origin/gh/angelayi/137/head 2025-12-04T11:11:09.6066823Z * [new branch] gh/angelayi/137/orig -> origin/gh/angelayi/137/orig 2025-12-04T11:11:09.6067036Z * [new branch] gh/angelayi/138/base -> origin/gh/angelayi/138/base 2025-12-04T11:11:09.6067229Z * [new branch] gh/angelayi/138/head -> origin/gh/angelayi/138/head 2025-12-04T11:11:09.6067421Z * [new branch] gh/angelayi/138/orig -> origin/gh/angelayi/138/orig 2025-12-04T11:11:09.6067632Z * [new branch] gh/angelayi/139/base -> origin/gh/angelayi/139/base 2025-12-04T11:11:09.6067825Z * [new branch] gh/angelayi/139/head -> origin/gh/angelayi/139/head 2025-12-04T11:11:09.6068017Z * [new branch] gh/angelayi/139/orig -> origin/gh/angelayi/139/orig 2025-12-04T11:11:09.6068239Z * [new branch] gh/angelayi/140/base -> origin/gh/angelayi/140/base 2025-12-04T11:11:09.6068431Z * [new branch] gh/angelayi/140/head -> origin/gh/angelayi/140/head 2025-12-04T11:11:09.6068623Z * [new branch] gh/angelayi/140/orig -> origin/gh/angelayi/140/orig 2025-12-04T11:11:09.6068813Z * [new branch] gh/angelayi/141/base -> origin/gh/angelayi/141/base 2025-12-04T11:11:09.6069006Z * [new branch] gh/angelayi/141/head -> origin/gh/angelayi/141/head 2025-12-04T11:11:09.6069201Z * [new branch] gh/angelayi/141/orig -> origin/gh/angelayi/141/orig 2025-12-04T11:11:09.6069388Z * [new branch] gh/angelayi/142/base -> origin/gh/angelayi/142/base 
2025-12-04T11:11:09.6069581Z * [new branch] gh/angelayi/142/head -> origin/gh/angelayi/142/head 2025-12-04T11:11:09.6069771Z * [new branch] gh/angelayi/142/orig -> origin/gh/angelayi/142/orig 2025-12-04T11:11:09.6069957Z * [new branch] gh/angelayi/143/base -> origin/gh/angelayi/143/base 2025-12-04T11:11:09.6070208Z * [new branch] gh/angelayi/143/head -> origin/gh/angelayi/143/head 2025-12-04T11:11:09.6070398Z * [new branch] gh/angelayi/143/orig -> origin/gh/angelayi/143/orig 2025-12-04T11:11:09.6070595Z * [new branch] gh/angelayi/144/base -> origin/gh/angelayi/144/base 2025-12-04T11:11:09.6070789Z * [new branch] gh/angelayi/144/head -> origin/gh/angelayi/144/head 2025-12-04T11:11:09.6070981Z * [new branch] gh/angelayi/144/orig -> origin/gh/angelayi/144/orig 2025-12-04T11:11:09.6071182Z * [new branch] gh/anijain2305/753/base -> origin/gh/anijain2305/753/base 2025-12-04T11:11:09.6071387Z * [new branch] gh/anijain2305/753/head -> origin/gh/anijain2305/753/head 2025-12-04T11:11:09.6071582Z * [new branch] gh/anijain2305/753/orig -> origin/gh/anijain2305/753/orig 2025-12-04T11:11:09.6071784Z * [new branch] gh/anijain2305/810/base -> origin/gh/anijain2305/810/base 2025-12-04T11:11:09.6071984Z * [new branch] gh/anijain2305/810/head -> origin/gh/anijain2305/810/head 2025-12-04T11:11:09.6072174Z * [new branch] gh/anijain2305/810/orig -> origin/gh/anijain2305/810/orig 2025-12-04T11:11:09.6072376Z * [new branch] gh/anijain2305/854/base -> origin/gh/anijain2305/854/base 2025-12-04T11:11:09.6072574Z * [new branch] gh/anijain2305/854/head -> origin/gh/anijain2305/854/head 2025-12-04T11:11:09.6072770Z * [new branch] gh/anijain2305/854/orig -> origin/gh/anijain2305/854/orig 2025-12-04T11:11:09.6072969Z * [new branch] gh/anijain2305/864/base -> origin/gh/anijain2305/864/base 2025-12-04T11:11:09.6073167Z * [new branch] gh/anijain2305/864/head -> origin/gh/anijain2305/864/head 2025-12-04T11:11:09.6073361Z * [new branch] gh/anijain2305/864/orig -> origin/gh/anijain2305/864/orig 2025-12-04T11:11:09.6073557Z * [new branch] gh/anijain2305/870/base -> origin/gh/anijain2305/870/base 2025-12-04T11:11:09.6073748Z * [new branch] gh/anijain2305/870/head -> origin/gh/anijain2305/870/head 2025-12-04T11:11:09.6073980Z * [new branch] gh/anijain2305/870/orig -> origin/gh/anijain2305/870/orig 2025-12-04T11:11:09.6074176Z * [new branch] gh/anijain2305/873/base -> origin/gh/anijain2305/873/base 2025-12-04T11:11:09.6074368Z * [new branch] gh/anijain2305/873/head -> origin/gh/anijain2305/873/head 2025-12-04T11:11:09.6074706Z * [new branch] gh/anijain2305/873/orig -> origin/gh/anijain2305/873/orig 2025-12-04T11:11:09.6074898Z * [new branch] gh/anijain2305/894/base -> origin/gh/anijain2305/894/base 2025-12-04T11:11:09.6075090Z * [new branch] gh/anijain2305/894/head -> origin/gh/anijain2305/894/head 2025-12-04T11:11:09.6075279Z * [new branch] gh/anijain2305/894/orig -> origin/gh/anijain2305/894/orig 2025-12-04T11:11:09.6075474Z * [new branch] gh/anijain2305/895/base -> origin/gh/anijain2305/895/base 2025-12-04T11:11:09.6075662Z * [new branch] gh/anijain2305/895/head -> origin/gh/anijain2305/895/head 2025-12-04T11:11:09.6075862Z * [new branch] gh/anijain2305/895/orig -> origin/gh/anijain2305/895/orig 2025-12-04T11:11:09.6076053Z * [new branch] gh/anijain2305/910/base -> origin/gh/anijain2305/910/base 2025-12-04T11:11:09.6076245Z * [new branch] gh/anijain2305/910/head -> origin/gh/anijain2305/910/head 2025-12-04T11:11:09.6076439Z * [new branch] gh/anijain2305/910/orig -> origin/gh/anijain2305/910/orig 2025-12-04T11:11:09.6076631Z * 
[new branch] gh/anijain2305/919/base -> origin/gh/anijain2305/919/base 2025-12-04T11:11:09.6076820Z * [new branch] gh/anijain2305/919/head -> origin/gh/anijain2305/919/head 2025-12-04T11:11:09.6077014Z * [new branch] gh/anijain2305/919/orig -> origin/gh/anijain2305/919/orig 2025-12-04T11:11:09.6077208Z * [new branch] gh/anijain2305/922/base -> origin/gh/anijain2305/922/base 2025-12-04T11:11:09.6077397Z * [new branch] gh/anijain2305/922/head -> origin/gh/anijain2305/922/head 2025-12-04T11:11:09.6077591Z * [new branch] gh/anijain2305/922/orig -> origin/gh/anijain2305/922/orig 2025-12-04T11:11:09.6077784Z * [new branch] gh/anijain2305/932/base -> origin/gh/anijain2305/932/base 2025-12-04T11:11:09.6077977Z * [new branch] gh/anijain2305/932/head -> origin/gh/anijain2305/932/head 2025-12-04T11:11:09.6078206Z * [new branch] gh/anijain2305/932/orig -> origin/gh/anijain2305/932/orig 2025-12-04T11:11:09.6078400Z * [new branch] gh/anijain2305/940/base -> origin/gh/anijain2305/940/base 2025-12-04T11:11:09.6078589Z * [new branch] gh/anijain2305/940/head -> origin/gh/anijain2305/940/head 2025-12-04T11:11:09.6078781Z * [new branch] gh/anijain2305/940/orig -> origin/gh/anijain2305/940/orig 2025-12-04T11:11:09.6078976Z * [new branch] gh/anijain2305/941/base -> origin/gh/anijain2305/941/base 2025-12-04T11:11:09.6079168Z * [new branch] gh/anijain2305/941/head -> origin/gh/anijain2305/941/head 2025-12-04T11:11:09.6079361Z * [new branch] gh/anijain2305/941/orig -> origin/gh/anijain2305/941/orig 2025-12-04T11:11:09.6079555Z * [new branch] gh/anijain2305/942/base -> origin/gh/anijain2305/942/base 2025-12-04T11:11:09.6079746Z * [new branch] gh/anijain2305/942/head -> origin/gh/anijain2305/942/head 2025-12-04T11:11:09.6079940Z * [new branch] gh/anijain2305/942/orig -> origin/gh/anijain2305/942/orig 2025-12-04T11:11:09.6080130Z * [new branch] gh/anijain2305/943/base -> origin/gh/anijain2305/943/base 2025-12-04T11:11:09.6080323Z * [new branch] gh/anijain2305/943/head -> origin/gh/anijain2305/943/head 2025-12-04T11:11:09.6080516Z * [new branch] gh/anijain2305/943/orig -> origin/gh/anijain2305/943/orig 2025-12-04T11:11:09.6080705Z * [new branch] gh/anijain2305/944/base -> origin/gh/anijain2305/944/base 2025-12-04T11:11:09.6080934Z * [new branch] gh/anijain2305/944/head -> origin/gh/anijain2305/944/head 2025-12-04T11:11:09.6081129Z * [new branch] gh/anijain2305/944/orig -> origin/gh/anijain2305/944/orig 2025-12-04T11:11:09.6081352Z * [new branch] gh/anijain2305/945/base -> origin/gh/anijain2305/945/base 2025-12-04T11:11:09.6081545Z * [new branch] gh/anijain2305/945/head -> origin/gh/anijain2305/945/head 2025-12-04T11:11:09.6081740Z * [new branch] gh/anijain2305/945/orig -> origin/gh/anijain2305/945/orig 2025-12-04T11:11:09.6081930Z * [new branch] gh/anijain2305/946/base -> origin/gh/anijain2305/946/base 2025-12-04T11:11:09.6082123Z * [new branch] gh/anijain2305/946/head -> origin/gh/anijain2305/946/head 2025-12-04T11:11:09.6082316Z * [new branch] gh/anijain2305/946/orig -> origin/gh/anijain2305/946/orig 2025-12-04T11:11:09.6082506Z * [new branch] gh/anijain2305/947/base -> origin/gh/anijain2305/947/base 2025-12-04T11:11:09.6082701Z * [new branch] gh/anijain2305/947/head -> origin/gh/anijain2305/947/head 2025-12-04T11:11:09.6082897Z * [new branch] gh/anijain2305/947/orig -> origin/gh/anijain2305/947/orig 2025-12-04T11:11:09.6083092Z * [new branch] gh/anijain2305/948/base -> origin/gh/anijain2305/948/base 2025-12-04T11:11:09.6083287Z * [new branch] gh/anijain2305/948/head -> origin/gh/anijain2305/948/head 
2025-12-04T11:11:09.6083481Z * [new branch] gh/anijain2305/948/orig -> origin/gh/anijain2305/948/orig 2025-12-04T11:11:09.6083670Z * [new branch] gh/anijain2305/949/base -> origin/gh/anijain2305/949/base 2025-12-04T11:11:09.6083865Z * [new branch] gh/anijain2305/949/head -> origin/gh/anijain2305/949/head 2025-12-04T11:11:09.6084057Z * [new branch] gh/anijain2305/949/orig -> origin/gh/anijain2305/949/orig 2025-12-04T11:11:09.6084253Z * [new branch] gh/anijain2305/950/base -> origin/gh/anijain2305/950/base 2025-12-04T11:11:09.6084447Z * [new branch] gh/anijain2305/950/head -> origin/gh/anijain2305/950/head 2025-12-04T11:11:09.6084635Z * [new branch] gh/anijain2305/950/orig -> origin/gh/anijain2305/950/orig 2025-12-04T11:11:09.6084834Z * [new branch] gh/anijain2305/951/base -> origin/gh/anijain2305/951/base 2025-12-04T11:11:09.6085029Z * [new branch] gh/anijain2305/951/head -> origin/gh/anijain2305/951/head 2025-12-04T11:11:09.6085220Z * [new branch] gh/anijain2305/951/orig -> origin/gh/anijain2305/951/orig 2025-12-04T11:11:09.6085414Z * [new branch] gh/anijain2305/952/base -> origin/gh/anijain2305/952/base 2025-12-04T11:11:09.6085607Z * [new branch] gh/anijain2305/952/head -> origin/gh/anijain2305/952/head 2025-12-04T11:11:09.6085794Z * [new branch] gh/anijain2305/952/orig -> origin/gh/anijain2305/952/orig 2025-12-04T11:11:09.6085991Z * [new branch] gh/anijain2305/953/base -> origin/gh/anijain2305/953/base 2025-12-04T11:11:09.6086185Z * [new branch] gh/anijain2305/953/head -> origin/gh/anijain2305/953/head 2025-12-04T11:11:09.6086377Z * [new branch] gh/anijain2305/953/orig -> origin/gh/anijain2305/953/orig 2025-12-04T11:11:09.6086569Z * [new branch] gh/anijain2305/954/base -> origin/gh/anijain2305/954/base 2025-12-04T11:11:09.6086763Z * [new branch] gh/anijain2305/954/head -> origin/gh/anijain2305/954/head 2025-12-04T11:11:09.6086951Z * [new branch] gh/anijain2305/954/orig -> origin/gh/anijain2305/954/orig 2025-12-04T11:11:09.6087144Z * [new branch] gh/anijain2305/955/base -> origin/gh/anijain2305/955/base 2025-12-04T11:11:09.6087336Z * [new branch] gh/anijain2305/955/head -> origin/gh/anijain2305/955/head 2025-12-04T11:11:09.6087525Z * [new branch] gh/anijain2305/955/orig -> origin/gh/anijain2305/955/orig 2025-12-04T11:11:09.6087748Z * [new branch] gh/anijain2305/956/base -> origin/gh/anijain2305/956/base 2025-12-04T11:11:09.6087940Z * [new branch] gh/anijain2305/956/head -> origin/gh/anijain2305/956/head 2025-12-04T11:11:09.6088187Z * [new branch] gh/anijain2305/956/orig -> origin/gh/anijain2305/956/orig 2025-12-04T11:11:09.6088383Z * [new branch] gh/anijain2305/957/base -> origin/gh/anijain2305/957/base 2025-12-04T11:11:09.6088576Z * [new branch] gh/anijain2305/957/head -> origin/gh/anijain2305/957/head 2025-12-04T11:11:09.6088764Z * [new branch] gh/anijain2305/957/orig -> origin/gh/anijain2305/957/orig 2025-12-04T11:11:09.6088953Z * [new branch] gh/anijain2305/958/base -> origin/gh/anijain2305/958/base 2025-12-04T11:11:09.6089145Z * [new branch] gh/anijain2305/958/head -> origin/gh/anijain2305/958/head 2025-12-04T11:11:09.6089338Z * [new branch] gh/anijain2305/958/orig -> origin/gh/anijain2305/958/orig 2025-12-04T11:11:09.6089526Z * [new branch] gh/anijain2305/959/base -> origin/gh/anijain2305/959/base 2025-12-04T11:11:09.6089711Z * [new branch] gh/anijain2305/959/head -> origin/gh/anijain2305/959/head 2025-12-04T11:11:09.6089905Z * [new branch] gh/anijain2305/959/orig -> origin/gh/anijain2305/959/orig 2025-12-04T11:11:09.6090093Z * [new branch] gh/anijain2305/960/base -> 
origin/gh/anijain2305/960/base 2025-12-04T11:11:09.6090279Z * [new branch] gh/anijain2305/960/head -> origin/gh/anijain2305/960/head 2025-12-04T11:11:09.6090472Z * [new branch] gh/anijain2305/960/orig -> origin/gh/anijain2305/960/orig 2025-12-04T11:11:09.6090664Z * [new branch] gh/anijain2305/961/base -> origin/gh/anijain2305/961/base 2025-12-04T11:11:09.6090849Z * [new branch] gh/anijain2305/961/head -> origin/gh/anijain2305/961/head 2025-12-04T11:11:09.6091041Z * [new branch] gh/anijain2305/961/orig -> origin/gh/anijain2305/961/orig 2025-12-04T11:11:09.6091232Z * [new branch] gh/anijain2305/962/base -> origin/gh/anijain2305/962/base 2025-12-04T11:11:09.6091421Z * [new branch] gh/anijain2305/962/head -> origin/gh/anijain2305/962/head 2025-12-04T11:11:09.6091612Z * [new branch] gh/anijain2305/962/orig -> origin/gh/anijain2305/962/orig 2025-12-04T11:11:09.6091800Z * [new branch] gh/anijain2305/963/base -> origin/gh/anijain2305/963/base 2025-12-04T11:11:09.6091988Z * [new branch] gh/anijain2305/963/head -> origin/gh/anijain2305/963/head 2025-12-04T11:11:09.6092180Z * [new branch] gh/anijain2305/963/orig -> origin/gh/anijain2305/963/orig 2025-12-04T11:11:09.6092371Z * [new branch] gh/anijain2305/964/base -> origin/gh/anijain2305/964/base 2025-12-04T11:11:09.6092558Z * [new branch] gh/anijain2305/964/head -> origin/gh/anijain2305/964/head 2025-12-04T11:11:09.6092748Z * [new branch] gh/anijain2305/964/orig -> origin/gh/anijain2305/964/orig 2025-12-04T11:11:09.6092939Z * [new branch] gh/anijain2305/965/base -> origin/gh/anijain2305/965/base 2025-12-04T11:11:09.6093131Z * [new branch] gh/anijain2305/965/head -> origin/gh/anijain2305/965/head 2025-12-04T11:11:09.6093324Z * [new branch] gh/anijain2305/965/orig -> origin/gh/anijain2305/965/orig 2025-12-04T11:11:09.6093515Z * [new branch] gh/anijain2305/966/base -> origin/gh/anijain2305/966/base 2025-12-04T11:11:09.6093701Z * [new branch] gh/anijain2305/966/head -> origin/gh/anijain2305/966/head 2025-12-04T11:11:09.6093891Z * [new branch] gh/anijain2305/966/orig -> origin/gh/anijain2305/966/orig 2025-12-04T11:11:09.6094084Z * [new branch] gh/anijain2305/967/base -> origin/gh/anijain2305/967/base 2025-12-04T11:11:09.6094309Z * [new branch] gh/anijain2305/967/head -> origin/gh/anijain2305/967/head 2025-12-04T11:11:09.6094500Z * [new branch] gh/anijain2305/967/orig -> origin/gh/anijain2305/967/orig 2025-12-04T11:11:09.6094687Z * [new branch] gh/anijain2305/968/base -> origin/gh/anijain2305/968/base 2025-12-04T11:11:09.6094901Z * [new branch] gh/anijain2305/968/head -> origin/gh/anijain2305/968/head 2025-12-04T11:11:09.6095091Z * [new branch] gh/anijain2305/968/orig -> origin/gh/anijain2305/968/orig 2025-12-04T11:11:09.6095277Z * [new branch] gh/anijain2305/969/base -> origin/gh/anijain2305/969/base 2025-12-04T11:11:09.6095466Z * [new branch] gh/anijain2305/969/head -> origin/gh/anijain2305/969/head 2025-12-04T11:11:09.6095657Z * [new branch] gh/anijain2305/969/orig -> origin/gh/anijain2305/969/orig 2025-12-04T11:11:09.6095846Z * [new branch] gh/anijain2305/970/base -> origin/gh/anijain2305/970/base 2025-12-04T11:11:09.6096043Z * [new branch] gh/anijain2305/970/head -> origin/gh/anijain2305/970/head 2025-12-04T11:11:09.6096234Z * [new branch] gh/anijain2305/970/orig -> origin/gh/anijain2305/970/orig 2025-12-04T11:11:09.6096420Z * [new branch] gh/anjali411/216/base -> origin/gh/anjali411/216/base 2025-12-04T11:11:09.6096608Z * [new branch] gh/anjali411/216/head -> origin/gh/anjali411/216/head 2025-12-04T11:11:09.6096794Z * [new branch] 
gh/anjali411/216/orig -> origin/gh/anjali411/216/orig 2025-12-04T11:11:09.6096979Z * [new branch] gh/anshul-si/1/base -> origin/gh/anshul-si/1/base 2025-12-04T11:11:09.6097163Z * [new branch] gh/anshul-si/1/head -> origin/gh/anshul-si/1/head 2025-12-04T11:11:09.6097344Z * [new branch] gh/anshul-si/2/base -> origin/gh/anshul-si/2/base 2025-12-04T11:11:09.6097522Z * [new branch] gh/anshul-si/2/head -> origin/gh/anshul-si/2/head 2025-12-04T11:11:09.6097705Z * [new branch] gh/anshul-si/3/base -> origin/gh/anshul-si/3/base 2025-12-04T11:11:09.6097886Z * [new branch] gh/anshul-si/3/head -> origin/gh/anshul-si/3/head 2025-12-04T11:11:09.6098065Z * [new branch] gh/anshul-si/4/base -> origin/gh/anshul-si/4/base 2025-12-04T11:11:09.6098274Z * [new branch] gh/anshul-si/4/head -> origin/gh/anshul-si/4/head 2025-12-04T11:11:09.6098455Z * [new branch] gh/anshul-si/5/base -> origin/gh/anshul-si/5/base 2025-12-04T11:11:09.6098635Z * [new branch] gh/anshul-si/5/head -> origin/gh/anshul-si/5/head 2025-12-04T11:11:09.6098818Z * [new branch] gh/anshul-si/53/base -> origin/gh/anshul-si/53/base 2025-12-04T11:11:09.6099001Z * [new branch] gh/anshul-si/53/head -> origin/gh/anshul-si/53/head 2025-12-04T11:11:09.6099185Z * [new branch] gh/anshul-si/58/base -> origin/gh/anshul-si/58/base 2025-12-04T11:11:09.6099371Z * [new branch] gh/anshul-si/58/head -> origin/gh/anshul-si/58/head 2025-12-04T11:11:09.6099551Z * [new branch] gh/anshul-si/66/base -> origin/gh/anshul-si/66/base 2025-12-04T11:11:09.6099734Z * [new branch] gh/anshul-si/66/head -> origin/gh/anshul-si/66/head 2025-12-04T11:11:09.6099916Z * [new branch] gh/anshul-si/66/orig -> origin/gh/anshul-si/66/orig 2025-12-04T11:11:09.6100094Z * [new branch] gh/anshul-si/67/base -> origin/gh/anshul-si/67/base 2025-12-04T11:11:09.6100276Z * [new branch] gh/anshul-si/67/head -> origin/gh/anshul-si/67/head 2025-12-04T11:11:09.6100458Z * [new branch] gh/anshul-si/67/orig -> origin/gh/anshul-si/67/orig 2025-12-04T11:11:09.6100638Z * [new branch] gh/anshul-si/68/base -> origin/gh/anshul-si/68/base 2025-12-04T11:11:09.6100820Z * [new branch] gh/anshul-si/68/head -> origin/gh/anshul-si/68/head 2025-12-04T11:11:09.6101043Z * [new branch] gh/anshul-si/68/orig -> origin/gh/anshul-si/68/orig 2025-12-04T11:11:09.6101225Z * [new branch] gh/anshul-si/69/base -> origin/gh/anshul-si/69/base 2025-12-04T11:11:09.6101438Z * [new branch] gh/anshul-si/69/head -> origin/gh/anshul-si/69/head 2025-12-04T11:11:09.6101619Z * [new branch] gh/anshul-si/69/orig -> origin/gh/anshul-si/69/orig 2025-12-04T11:11:09.6101799Z * [new branch] gh/anshul-si/70/base -> origin/gh/anshul-si/70/base 2025-12-04T11:11:09.6101981Z * [new branch] gh/anshul-si/70/head -> origin/gh/anshul-si/70/head 2025-12-04T11:11:09.6102163Z * [new branch] gh/anshul-si/70/orig -> origin/gh/anshul-si/70/orig 2025-12-04T11:11:09.6102342Z * [new branch] gh/anshul-si/71/base -> origin/gh/anshul-si/71/base 2025-12-04T11:11:09.6102523Z * [new branch] gh/anshul-si/71/head -> origin/gh/anshul-si/71/head 2025-12-04T11:11:09.6102709Z * [new branch] gh/anshul-si/71/orig -> origin/gh/anshul-si/71/orig 2025-12-04T11:11:09.6102887Z * [new branch] gh/anshul-si/72/base -> origin/gh/anshul-si/72/base 2025-12-04T11:11:09.6103072Z * [new branch] gh/anshul-si/72/head -> origin/gh/anshul-si/72/head 2025-12-04T11:11:09.6103252Z * [new branch] gh/anshul-si/72/orig -> origin/gh/anshul-si/72/orig 2025-12-04T11:11:09.6103434Z * [new branch] gh/anshul-si/73/base -> origin/gh/anshul-si/73/base 2025-12-04T11:11:09.6103615Z * [new branch] gh/anshul-si/73/head 
-> origin/gh/anshul-si/73/head 2025-12-04T11:11:09.6103793Z * [new branch] gh/anshul-si/73/orig -> origin/gh/anshul-si/73/orig 2025-12-04T11:11:09.6103976Z * [new branch] gh/aorenste/132/base -> origin/gh/aorenste/132/base 2025-12-04T11:11:09.6104160Z * [new branch] gh/aorenste/132/head -> origin/gh/aorenste/132/head 2025-12-04T11:11:09.6104344Z * [new branch] gh/aorenste/134/base -> origin/gh/aorenste/134/base 2025-12-04T11:11:09.6104529Z * [new branch] gh/aorenste/134/head -> origin/gh/aorenste/134/head 2025-12-04T11:11:09.6104720Z * [new branch] gh/aorenste/134/orig -> origin/gh/aorenste/134/orig 2025-12-04T11:11:09.6104901Z * [new branch] gh/aorenste/139/base -> origin/gh/aorenste/139/base 2025-12-04T11:11:09.6105084Z * [new branch] gh/aorenste/139/head -> origin/gh/aorenste/139/head 2025-12-04T11:11:09.6105278Z * [new branch] gh/aorenste/139/orig -> origin/gh/aorenste/139/orig 2025-12-04T11:11:09.6105459Z * [new branch] gh/aorenste/141/base -> origin/gh/aorenste/141/base 2025-12-04T11:11:09.6105644Z * [new branch] gh/aorenste/141/head -> origin/gh/aorenste/141/head 2025-12-04T11:11:09.6105834Z * [new branch] gh/aorenste/145/base -> origin/gh/aorenste/145/base 2025-12-04T11:11:09.6106021Z * [new branch] gh/aorenste/145/head -> origin/gh/aorenste/145/head 2025-12-04T11:11:09.6106210Z * [new branch] gh/aorenste/145/orig -> origin/gh/aorenste/145/orig 2025-12-04T11:11:09.6106407Z * [new branch] gh/aorenste/146/base -> origin/gh/aorenste/146/base 2025-12-04T11:11:09.6106593Z * [new branch] gh/aorenste/146/head -> origin/gh/aorenste/146/head 2025-12-04T11:11:09.6106782Z * [new branch] gh/aorenste/146/orig -> origin/gh/aorenste/146/orig 2025-12-04T11:11:09.6106971Z * [new branch] gh/aorenste/147/base -> origin/gh/aorenste/147/base 2025-12-04T11:11:09.6107153Z * [new branch] gh/aorenste/147/head -> origin/gh/aorenste/147/head 2025-12-04T11:11:09.6107339Z * [new branch] gh/aorenste/147/orig -> origin/gh/aorenste/147/orig 2025-12-04T11:11:09.6107551Z * [new branch] gh/aorenste/148/base -> origin/gh/aorenste/148/base 2025-12-04T11:11:09.6107737Z * [new branch] gh/aorenste/148/head -> origin/gh/aorenste/148/head 2025-12-04T11:11:09.6107927Z * [new branch] gh/aorenste/148/orig -> origin/gh/aorenste/148/orig 2025-12-04T11:11:09.6108136Z * [new branch] gh/aorenste/149/base -> origin/gh/aorenste/149/base 2025-12-04T11:11:09.6108376Z * [new branch] gh/aorenste/149/head -> origin/gh/aorenste/149/head 2025-12-04T11:11:09.6108563Z * [new branch] gh/aorenste/149/orig -> origin/gh/aorenste/149/orig 2025-12-04T11:11:09.6108748Z * [new branch] gh/aorenste/150/base -> origin/gh/aorenste/150/base 2025-12-04T11:11:09.6108934Z * [new branch] gh/aorenste/150/head -> origin/gh/aorenste/150/head 2025-12-04T11:11:09.6109118Z * [new branch] gh/aorenste/150/orig -> origin/gh/aorenste/150/orig 2025-12-04T11:11:09.6109303Z * [new branch] gh/aorenste/151/base -> origin/gh/aorenste/151/base 2025-12-04T11:11:09.6109488Z * [new branch] gh/aorenste/151/head -> origin/gh/aorenste/151/head 2025-12-04T11:11:09.6109677Z * [new branch] gh/aorenste/151/orig -> origin/gh/aorenste/151/orig 2025-12-04T11:11:09.6109867Z * [new branch] gh/aorenste/152/base -> origin/gh/aorenste/152/base 2025-12-04T11:11:09.6110053Z * [new branch] gh/aorenste/152/head -> origin/gh/aorenste/152/head 2025-12-04T11:11:09.6110244Z * [new branch] gh/aorenste/152/orig -> origin/gh/aorenste/152/orig 2025-12-04T11:11:09.6110430Z * [new branch] gh/aorenste/153/base -> origin/gh/aorenste/153/base 2025-12-04T11:11:09.6110619Z * [new branch] gh/aorenste/153/head -> 
origin/gh/aorenste/153/head 2025-12-04T11:11:09.6110806Z * [new branch] gh/aorenste/153/orig -> origin/gh/aorenste/153/orig 2025-12-04T11:11:09.6110990Z * [new branch] gh/aorenste/154/base -> origin/gh/aorenste/154/base 2025-12-04T11:11:09.6111173Z * [new branch] gh/aorenste/154/head -> origin/gh/aorenste/154/head 2025-12-04T11:11:09.6111355Z * [new branch] gh/aorenste/154/orig -> origin/gh/aorenste/154/orig 2025-12-04T11:11:09.6111547Z * [new branch] gh/aorenste/155/base -> origin/gh/aorenste/155/base 2025-12-04T11:11:09.6111735Z * [new branch] gh/aorenste/155/head -> origin/gh/aorenste/155/head 2025-12-04T11:11:09.6111916Z * [new branch] gh/aorenste/155/orig -> origin/gh/aorenste/155/orig 2025-12-04T11:11:09.6112100Z * [new branch] gh/aorenste/156/base -> origin/gh/aorenste/156/base 2025-12-04T11:11:09.6112285Z * [new branch] gh/aorenste/156/head -> origin/gh/aorenste/156/head 2025-12-04T11:11:09.6112468Z * [new branch] gh/aorenste/156/orig -> origin/gh/aorenste/156/orig 2025-12-04T11:11:09.6112656Z * [new branch] gh/aorenste/157/base -> origin/gh/aorenste/157/base 2025-12-04T11:11:09.6112840Z * [new branch] gh/aorenste/157/head -> origin/gh/aorenste/157/head 2025-12-04T11:11:09.6113022Z * [new branch] gh/aorenste/157/orig -> origin/gh/aorenste/157/orig 2025-12-04T11:11:09.6113212Z * [new branch] gh/aorenste/158/base -> origin/gh/aorenste/158/base 2025-12-04T11:11:09.6113401Z * [new branch] gh/aorenste/158/head -> origin/gh/aorenste/158/head 2025-12-04T11:11:09.6113586Z * [new branch] gh/aorenste/158/orig -> origin/gh/aorenste/158/orig 2025-12-04T11:11:09.6113774Z * [new branch] gh/aorenste/159/base -> origin/gh/aorenste/159/base 2025-12-04T11:11:09.6113963Z * [new branch] gh/aorenste/159/head -> origin/gh/aorenste/159/head 2025-12-04T11:11:09.6114144Z * [new branch] gh/aorenste/159/orig -> origin/gh/aorenste/159/orig 2025-12-04T11:11:09.6114378Z * [new branch] gh/avikchaudhuri/1/base -> origin/gh/avikchaudhuri/1/base 2025-12-04T11:11:09.6114582Z * [new branch] gh/avikchaudhuri/1/head -> origin/gh/avikchaudhuri/1/head 2025-12-04T11:11:09.6114808Z * [new branch] gh/avikchaudhuri/2/base -> origin/gh/avikchaudhuri/2/base 2025-12-04T11:11:09.6115005Z * [new branch] gh/avikchaudhuri/2/head -> origin/gh/avikchaudhuri/2/head 2025-12-04T11:11:09.6115200Z * [new branch] gh/avikchaudhuri/2/orig -> origin/gh/avikchaudhuri/2/orig 2025-12-04T11:11:09.6115388Z * [new branch] gh/bdhirsh/666/base -> origin/gh/bdhirsh/666/base 2025-12-04T11:11:09.6115578Z * [new branch] gh/bdhirsh/666/head -> origin/gh/bdhirsh/666/head 2025-12-04T11:11:09.6115762Z * [new branch] gh/bdhirsh/666/orig -> origin/gh/bdhirsh/666/orig 2025-12-04T11:11:09.6115949Z * [new branch] gh/bdhirsh/668/base -> origin/gh/bdhirsh/668/base 2025-12-04T11:11:09.6116136Z * [new branch] gh/bdhirsh/668/head -> origin/gh/bdhirsh/668/head 2025-12-04T11:11:09.6116318Z * [new branch] gh/bdhirsh/668/orig -> origin/gh/bdhirsh/668/orig 2025-12-04T11:11:09.6116501Z * [new branch] gh/bdhirsh/669/base -> origin/gh/bdhirsh/669/base 2025-12-04T11:11:09.6116681Z * [new branch] gh/bdhirsh/669/head -> origin/gh/bdhirsh/669/head 2025-12-04T11:11:09.6116862Z * [new branch] gh/bdhirsh/669/orig -> origin/gh/bdhirsh/669/orig 2025-12-04T11:11:09.6117046Z * [new branch] gh/bdhirsh/670/base -> origin/gh/bdhirsh/670/base 2025-12-04T11:11:09.6117227Z * [new branch] gh/bdhirsh/670/head -> origin/gh/bdhirsh/670/head 2025-12-04T11:11:09.6117406Z * [new branch] gh/bdhirsh/670/orig -> origin/gh/bdhirsh/670/orig 2025-12-04T11:11:09.6117587Z * [new branch] gh/bdhirsh/672/base 
-> origin/gh/bdhirsh/672/base 2025-12-04T11:11:09.6117770Z * [new branch] gh/bdhirsh/672/head -> origin/gh/bdhirsh/672/head 2025-12-04T11:11:09.6117948Z * [new branch] gh/bdhirsh/672/orig -> origin/gh/bdhirsh/672/orig 2025-12-04T11:11:09.6118136Z * [new branch] gh/bdhirsh/675/base -> origin/gh/bdhirsh/675/base 2025-12-04T11:11:09.6118361Z * [new branch] gh/bdhirsh/675/head -> origin/gh/bdhirsh/675/head 2025-12-04T11:11:09.6118544Z * [new branch] gh/bdhirsh/675/orig -> origin/gh/bdhirsh/675/orig 2025-12-04T11:11:09.6118731Z * [new branch] gh/bdhirsh/676/base -> origin/gh/bdhirsh/676/base 2025-12-04T11:11:09.6118917Z * [new branch] gh/bdhirsh/676/head -> origin/gh/bdhirsh/676/head 2025-12-04T11:11:09.6119100Z * [new branch] gh/bdhirsh/676/orig -> origin/gh/bdhirsh/676/orig 2025-12-04T11:11:09.6119179Z * [new branch] gh/bdhirsh/677/base -> origin/gh/bdhirsh/677/base 2025-12-04T11:11:09.6119252Z * [new branch] gh/bdhirsh/677/head -> origin/gh/bdhirsh/677/head 2025-12-04T11:11:09.6119323Z * [new branch] gh/bdhirsh/677/orig -> origin/gh/bdhirsh/677/orig 2025-12-04T11:11:09.6119401Z * [new branch] gh/bdhirsh/678/base -> origin/gh/bdhirsh/678/base 2025-12-04T11:11:09.6119471Z * [new branch] gh/bdhirsh/678/head -> origin/gh/bdhirsh/678/head 2025-12-04T11:11:09.6119542Z * [new branch] gh/bdhirsh/678/orig -> origin/gh/bdhirsh/678/orig 2025-12-04T11:11:09.6119616Z * [new branch] gh/bdhirsh/679/base -> origin/gh/bdhirsh/679/base 2025-12-04T11:11:09.6119687Z * [new branch] gh/bdhirsh/679/head -> origin/gh/bdhirsh/679/head 2025-12-04T11:11:09.6119758Z * [new branch] gh/bdhirsh/679/orig -> origin/gh/bdhirsh/679/orig 2025-12-04T11:11:09.6119831Z * [new branch] gh/bdhirsh/680/base -> origin/gh/bdhirsh/680/base 2025-12-04T11:11:09.6119936Z * [new branch] gh/bdhirsh/680/head -> origin/gh/bdhirsh/680/head 2025-12-04T11:11:09.6120008Z * [new branch] gh/bdhirsh/680/orig -> origin/gh/bdhirsh/680/orig 2025-12-04T11:11:09.6120125Z * [new branch] gh/bdhirsh/681/base -> origin/gh/bdhirsh/681/base 2025-12-04T11:11:09.6120196Z * [new branch] gh/bdhirsh/681/head -> origin/gh/bdhirsh/681/head 2025-12-04T11:11:09.6120270Z * [new branch] gh/bdhirsh/681/orig -> origin/gh/bdhirsh/681/orig 2025-12-04T11:11:09.6120369Z * [new branch] gh/benjaminglass1/101/base -> origin/gh/benjaminglass1/101/base 2025-12-04T11:11:09.6120462Z * [new branch] gh/benjaminglass1/101/head -> origin/gh/benjaminglass1/101/head 2025-12-04T11:11:09.6120552Z * [new branch] gh/benjaminglass1/101/orig -> origin/gh/benjaminglass1/101/orig 2025-12-04T11:11:09.6120650Z * [new branch] gh/benjaminglass1/102/base -> origin/gh/benjaminglass1/102/base 2025-12-04T11:11:09.6120739Z * [new branch] gh/benjaminglass1/102/head -> origin/gh/benjaminglass1/102/head 2025-12-04T11:11:09.6120829Z * [new branch] gh/benjaminglass1/102/orig -> origin/gh/benjaminglass1/102/orig 2025-12-04T11:11:09.6120917Z * [new branch] gh/benjaminglass1/106/base -> origin/gh/benjaminglass1/106/base 2025-12-04T11:11:09.6121003Z * [new branch] gh/benjaminglass1/106/head -> origin/gh/benjaminglass1/106/head 2025-12-04T11:11:09.6121092Z * [new branch] gh/benjaminglass1/106/orig -> origin/gh/benjaminglass1/106/orig 2025-12-04T11:11:09.6121179Z * [new branch] gh/benjaminglass1/107/base -> origin/gh/benjaminglass1/107/base 2025-12-04T11:11:09.6121266Z * [new branch] gh/benjaminglass1/107/head -> origin/gh/benjaminglass1/107/head 2025-12-04T11:11:09.6121360Z * [new branch] gh/benjaminglass1/107/orig -> origin/gh/benjaminglass1/107/orig 2025-12-04T11:11:09.6121452Z * [new branch] 
gh/benjaminglass1/108/base -> origin/gh/benjaminglass1/108/base 2025-12-04T11:11:09.6121539Z * [new branch] gh/benjaminglass1/108/head -> origin/gh/benjaminglass1/108/head 2025-12-04T11:11:09.6121630Z * [new branch] gh/benjaminglass1/108/orig -> origin/gh/benjaminglass1/108/orig 2025-12-04T11:11:09.6121716Z * [new branch] gh/benjaminglass1/109/base -> origin/gh/benjaminglass1/109/base 2025-12-04T11:11:09.6121802Z * [new branch] gh/benjaminglass1/109/head -> origin/gh/benjaminglass1/109/head 2025-12-04T11:11:09.6121892Z * [new branch] gh/benjaminglass1/109/orig -> origin/gh/benjaminglass1/109/orig 2025-12-04T11:11:09.6121981Z * [new branch] gh/benjaminglass1/97/base -> origin/gh/benjaminglass1/97/base 2025-12-04T11:11:09.6122069Z * [new branch] gh/benjaminglass1/97/head -> origin/gh/benjaminglass1/97/head 2025-12-04T11:11:09.6122160Z * [new branch] gh/benjaminglass1/97/orig -> origin/gh/benjaminglass1/97/orig 2025-12-04T11:11:09.6122242Z * [new branch] gh/bobrenjc93/570/base -> origin/gh/bobrenjc93/570/base 2025-12-04T11:11:09.6122327Z * [new branch] gh/bobrenjc93/570/head -> origin/gh/bobrenjc93/570/head 2025-12-04T11:11:09.6122404Z * [new branch] gh/bobrenjc93/570/orig -> origin/gh/bobrenjc93/570/orig 2025-12-04T11:11:09.6122482Z * [new branch] gh/bobrenjc93/604/base -> origin/gh/bobrenjc93/604/base 2025-12-04T11:11:09.6122563Z * [new branch] gh/bobrenjc93/604/head -> origin/gh/bobrenjc93/604/head 2025-12-04T11:11:09.6122639Z * [new branch] gh/bobrenjc93/604/orig -> origin/gh/bobrenjc93/604/orig 2025-12-04T11:11:09.6122716Z * [new branch] gh/bobrenjc93/638/base -> origin/gh/bobrenjc93/638/base 2025-12-04T11:11:09.6122797Z * [new branch] gh/bobrenjc93/638/head -> origin/gh/bobrenjc93/638/head 2025-12-04T11:11:09.6122891Z * [new branch] gh/bobrenjc93/638/orig -> origin/gh/bobrenjc93/638/orig 2025-12-04T11:11:09.6122966Z * [new branch] gh/bobrenjc93/653/base -> origin/gh/bobrenjc93/653/base 2025-12-04T11:11:09.6123069Z * [new branch] gh/bobrenjc93/653/head -> origin/gh/bobrenjc93/653/head 2025-12-04T11:11:09.6123146Z * [new branch] gh/bobrenjc93/653/orig -> origin/gh/bobrenjc93/653/orig 2025-12-04T11:11:09.6123221Z * [new branch] gh/bobrenjc93/654/base -> origin/gh/bobrenjc93/654/base 2025-12-04T11:11:09.6123305Z * [new branch] gh/bobrenjc93/654/head -> origin/gh/bobrenjc93/654/head 2025-12-04T11:11:09.6123381Z * [new branch] gh/bobrenjc93/654/orig -> origin/gh/bobrenjc93/654/orig 2025-12-04T11:11:09.6123457Z * [new branch] gh/bobrenjc93/657/base -> origin/gh/bobrenjc93/657/base 2025-12-04T11:11:09.6123539Z * [new branch] gh/bobrenjc93/657/head -> origin/gh/bobrenjc93/657/head 2025-12-04T11:11:09.6123615Z * [new branch] gh/bobrenjc93/657/orig -> origin/gh/bobrenjc93/657/orig 2025-12-04T11:11:09.6152903Z * [new branch] gh/bobrenjc93/672/base -> origin/gh/bobrenjc93/672/base 2025-12-04T11:11:09.6153027Z * [new branch] gh/bobrenjc93/672/head -> origin/gh/bobrenjc93/672/head 2025-12-04T11:11:09.6153108Z * [new branch] gh/bobrenjc93/672/orig -> origin/gh/bobrenjc93/672/orig 2025-12-04T11:11:09.6153184Z * [new branch] gh/bobrenjc93/679/base -> origin/gh/bobrenjc93/679/base 2025-12-04T11:11:09.6153259Z * [new branch] gh/bobrenjc93/679/head -> origin/gh/bobrenjc93/679/head 2025-12-04T11:11:09.6153332Z * [new branch] gh/bobrenjc93/679/orig -> origin/gh/bobrenjc93/679/orig 2025-12-04T11:11:09.6153425Z * [new branch] gh/bobrenjc93/680/base -> origin/gh/bobrenjc93/680/base 2025-12-04T11:11:09.6153503Z * [new branch] gh/bobrenjc93/680/head -> origin/gh/bobrenjc93/680/head 2025-12-04T11:11:09.6153589Z * 
[new branch] gh/bobrenjc93/680/orig -> origin/gh/bobrenjc93/680/orig 2025-12-04T11:11:09.6153664Z * [new branch] gh/bobrenjc93/681/base -> origin/gh/bobrenjc93/681/base 2025-12-04T11:11:09.6153766Z * [new branch] gh/bobrenjc93/681/head -> origin/gh/bobrenjc93/681/head 2025-12-04T11:11:09.6153852Z * [new branch] gh/bobrenjc93/681/orig -> origin/gh/bobrenjc93/681/orig 2025-12-04T11:11:09.6153940Z * [new branch] gh/bobrenjc93/682/base -> origin/gh/bobrenjc93/682/base 2025-12-04T11:11:09.6154032Z * [new branch] gh/bobrenjc93/682/head -> origin/gh/bobrenjc93/682/head 2025-12-04T11:11:09.6154117Z * [new branch] gh/bobrenjc93/682/orig -> origin/gh/bobrenjc93/682/orig 2025-12-04T11:11:09.6154202Z * [new branch] gh/bobrenjc93/683/base -> origin/gh/bobrenjc93/683/base 2025-12-04T11:11:09.6154283Z * [new branch] gh/bobrenjc93/683/head -> origin/gh/bobrenjc93/683/head 2025-12-04T11:11:09.6154397Z * [new branch] gh/bobrenjc93/683/orig -> origin/gh/bobrenjc93/683/orig 2025-12-04T11:11:09.6154488Z * [new branch] gh/bobrenjc93/684/base -> origin/gh/bobrenjc93/684/base 2025-12-04T11:11:09.6154564Z * [new branch] gh/bobrenjc93/684/head -> origin/gh/bobrenjc93/684/head 2025-12-04T11:11:09.6154639Z * [new branch] gh/bobrenjc93/684/orig -> origin/gh/bobrenjc93/684/orig 2025-12-04T11:11:09.6154720Z * [new branch] gh/bobrenjc93/685/base -> origin/gh/bobrenjc93/685/base 2025-12-04T11:11:09.6154796Z * [new branch] gh/bobrenjc93/685/head -> origin/gh/bobrenjc93/685/head 2025-12-04T11:11:09.6154873Z * [new branch] gh/bobrenjc93/685/orig -> origin/gh/bobrenjc93/685/orig 2025-12-04T11:11:09.6154954Z * [new branch] gh/bobrenjc93/686/base -> origin/gh/bobrenjc93/686/base 2025-12-04T11:11:09.6155100Z * [new branch] gh/bobrenjc93/686/head -> origin/gh/bobrenjc93/686/head 2025-12-04T11:11:09.6155178Z * [new branch] gh/bobrenjc93/686/orig -> origin/gh/bobrenjc93/686/orig 2025-12-04T11:11:09.6155286Z * [new branch] gh/bobrenjc93/687/base -> origin/gh/bobrenjc93/687/base 2025-12-04T11:11:09.6155363Z * [new branch] gh/bobrenjc93/687/head -> origin/gh/bobrenjc93/687/head 2025-12-04T11:11:09.6155439Z * [new branch] gh/bobrenjc93/687/orig -> origin/gh/bobrenjc93/687/orig 2025-12-04T11:11:09.6155522Z * [new branch] gh/bobrenjc93/688/base -> origin/gh/bobrenjc93/688/base 2025-12-04T11:11:09.6155598Z * [new branch] gh/bobrenjc93/688/head -> origin/gh/bobrenjc93/688/head 2025-12-04T11:11:09.6155673Z * [new branch] gh/bobrenjc93/688/orig -> origin/gh/bobrenjc93/688/orig 2025-12-04T11:11:09.6155762Z * [new branch] gh/bobrenjc93/689/base -> origin/gh/bobrenjc93/689/base 2025-12-04T11:11:09.6155838Z * [new branch] gh/bobrenjc93/689/head -> origin/gh/bobrenjc93/689/head 2025-12-04T11:11:09.6155913Z * [new branch] gh/bobrenjc93/689/orig -> origin/gh/bobrenjc93/689/orig 2025-12-04T11:11:09.6155997Z * [new branch] gh/bobrenjc93/690/base -> origin/gh/bobrenjc93/690/base 2025-12-04T11:11:09.6156073Z * [new branch] gh/bobrenjc93/690/head -> origin/gh/bobrenjc93/690/head 2025-12-04T11:11:09.6156154Z * [new branch] gh/bobrenjc93/690/orig -> origin/gh/bobrenjc93/690/orig 2025-12-04T11:11:09.6156230Z * [new branch] gh/bobrenjc93/691/base -> origin/gh/bobrenjc93/691/base 2025-12-04T11:11:09.6156307Z * [new branch] gh/bobrenjc93/691/head -> origin/gh/bobrenjc93/691/head 2025-12-04T11:11:09.6156386Z * [new branch] gh/bobrenjc93/691/orig -> origin/gh/bobrenjc93/691/orig 2025-12-04T11:11:09.6156464Z * [new branch] gh/bobrenjc93/692/base -> origin/gh/bobrenjc93/692/base 2025-12-04T11:11:09.6156540Z * [new branch] gh/bobrenjc93/692/head -> 
origin/gh/bobrenjc93/692/head 2025-12-04T11:11:09.6156622Z * [new branch] gh/bobrenjc93/692/orig -> origin/gh/bobrenjc93/692/orig 2025-12-04T11:11:09.6156700Z * [new branch] gh/bobrenjc93/693/base -> origin/gh/bobrenjc93/693/base 2025-12-04T11:11:09.6156776Z * [new branch] gh/bobrenjc93/693/head -> origin/gh/bobrenjc93/693/head 2025-12-04T11:11:09.6156858Z * [new branch] gh/bobrenjc93/693/orig -> origin/gh/bobrenjc93/693/orig 2025-12-04T11:11:09.6156934Z * [new branch] gh/bobrenjc93/694/base -> origin/gh/bobrenjc93/694/base 2025-12-04T11:11:09.6157010Z * [new branch] gh/bobrenjc93/694/head -> origin/gh/bobrenjc93/694/head 2025-12-04T11:11:09.6157091Z * [new branch] gh/bobrenjc93/694/orig -> origin/gh/bobrenjc93/694/orig 2025-12-04T11:11:09.6157165Z * [new branch] gh/bobrenjc93/695/base -> origin/gh/bobrenjc93/695/base 2025-12-04T11:11:09.6157240Z * [new branch] gh/bobrenjc93/695/head -> origin/gh/bobrenjc93/695/head 2025-12-04T11:11:09.6157326Z * [new branch] gh/bobrenjc93/695/orig -> origin/gh/bobrenjc93/695/orig 2025-12-04T11:11:09.6157399Z * [new branch] gh/c00w/23/base -> origin/gh/c00w/23/base 2025-12-04T11:11:09.6157469Z * [new branch] gh/c00w/23/head -> origin/gh/c00w/23/head 2025-12-04T11:11:09.6157542Z * [new branch] gh/c00w/53/base -> origin/gh/c00w/53/base 2025-12-04T11:11:09.6157610Z * [new branch] gh/c00w/53/head -> origin/gh/c00w/53/head 2025-12-04T11:11:09.6157677Z * [new branch] gh/c00w/53/orig -> origin/gh/c00w/53/orig 2025-12-04T11:11:09.6157749Z * [new branch] gh/c00w/54/base -> origin/gh/c00w/54/base 2025-12-04T11:11:09.6157842Z * [new branch] gh/c00w/54/head -> origin/gh/c00w/54/head 2025-12-04T11:11:09.6157915Z * [new branch] gh/c00w/54/orig -> origin/gh/c00w/54/orig 2025-12-04T11:11:09.6157981Z * [new branch] gh/c00w/56/base -> origin/gh/c00w/56/base 2025-12-04T11:11:09.6158070Z * [new branch] gh/c00w/56/head -> origin/gh/c00w/56/head 2025-12-04T11:11:09.6158142Z * [new branch] gh/c00w/56/orig -> origin/gh/c00w/56/orig 2025-12-04T11:11:09.6158245Z * [new branch] gh/c00w/57/base -> origin/gh/c00w/57/base 2025-12-04T11:11:09.6158312Z * [new branch] gh/c00w/57/head -> origin/gh/c00w/57/head 2025-12-04T11:11:09.6158384Z * [new branch] gh/c00w/57/orig -> origin/gh/c00w/57/orig 2025-12-04T11:11:09.6158447Z * [new branch] gh/c00w/58/base -> origin/gh/c00w/58/base 2025-12-04T11:11:09.6158514Z * [new branch] gh/c00w/58/head -> origin/gh/c00w/58/head 2025-12-04T11:11:09.6158587Z * [new branch] gh/c00w/58/orig -> origin/gh/c00w/58/orig 2025-12-04T11:11:09.6158666Z * [new branch] gh/clee2000/1/base -> origin/gh/clee2000/1/base 2025-12-04T11:11:09.6158747Z * [new branch] gh/clee2000/1/head -> origin/gh/clee2000/1/head 2025-12-04T11:11:09.6158825Z * [new branch] gh/clee2000/1/orig -> origin/gh/clee2000/1/orig 2025-12-04T11:11:09.6158911Z * [new branch] gh/coconutruben/1/base -> origin/gh/coconutruben/1/base 2025-12-04T11:11:09.6158992Z * [new branch] gh/coconutruben/1/head -> origin/gh/coconutruben/1/head 2025-12-04T11:11:09.6159084Z * [new branch] gh/coconutruben/55/base -> origin/gh/coconutruben/55/base 2025-12-04T11:11:09.6159166Z * [new branch] gh/coconutruben/55/head -> origin/gh/coconutruben/55/head 2025-12-04T11:11:09.6159247Z * [new branch] gh/coconutruben/55/orig -> origin/gh/coconutruben/55/orig 2025-12-04T11:11:09.6159332Z * [new branch] gh/coconutruben/57/base -> origin/gh/coconutruben/57/base 2025-12-04T11:11:09.6159411Z * [new branch] gh/coconutruben/57/head -> origin/gh/coconutruben/57/head 2025-12-04T11:11:09.6159494Z * [new branch] gh/coconutruben/57/orig -> 
origin/gh/coconutruben/57/orig 2025-12-04T11:11:09.6159578Z * [new branch] gh/coconutruben/70/base -> origin/gh/coconutruben/70/base 2025-12-04T11:11:09.6159655Z * [new branch] gh/coconutruben/70/head -> origin/gh/coconutruben/70/head 2025-12-04T11:11:09.6159735Z * [new branch] gh/coconutruben/70/orig -> origin/gh/coconutruben/70/orig 2025-12-04T11:11:09.6159821Z * [new branch] gh/coconutruben/71/base -> origin/gh/coconutruben/71/base 2025-12-04T11:11:09.6159900Z * [new branch] gh/coconutruben/71/head -> origin/gh/coconutruben/71/head 2025-12-04T11:11:09.6159987Z * [new branch] gh/coconutruben/71/orig -> origin/gh/coconutruben/71/orig 2025-12-04T11:11:09.6160067Z * [new branch] gh/coconutruben/72/base -> origin/gh/coconutruben/72/base 2025-12-04T11:11:09.6160148Z * [new branch] gh/coconutruben/72/head -> origin/gh/coconutruben/72/head 2025-12-04T11:11:09.6160231Z * [new branch] gh/coconutruben/72/orig -> origin/gh/coconutruben/72/orig 2025-12-04T11:11:09.6160310Z * [new branch] gh/coconutruben/73/base -> origin/gh/coconutruben/73/base 2025-12-04T11:11:09.6160389Z * [new branch] gh/coconutruben/73/head -> origin/gh/coconutruben/73/head 2025-12-04T11:11:09.6160474Z * [new branch] gh/coconutruben/73/orig -> origin/gh/coconutruben/73/orig 2025-12-04T11:11:09.6160553Z * [new branch] gh/coconutruben/74/base -> origin/gh/coconutruben/74/base 2025-12-04T11:11:09.6160632Z * [new branch] gh/coconutruben/74/head -> origin/gh/coconutruben/74/head 2025-12-04T11:11:09.6160747Z * [new branch] gh/coconutruben/74/orig -> origin/gh/coconutruben/74/orig 2025-12-04T11:11:09.6160824Z * [new branch] gh/coconutruben/79/base -> origin/gh/coconutruben/79/base 2025-12-04T11:11:09.6160947Z * [new branch] gh/coconutruben/79/head -> origin/gh/coconutruben/79/head 2025-12-04T11:11:09.6161032Z * [new branch] gh/coconutruben/79/orig -> origin/gh/coconutruben/79/orig 2025-12-04T11:11:09.6161112Z * [new branch] gh/coconutruben/80/base -> origin/gh/coconutruben/80/base 2025-12-04T11:11:09.6161192Z * [new branch] gh/coconutruben/80/head -> origin/gh/coconutruben/80/head 2025-12-04T11:11:09.6161278Z * [new branch] gh/coconutruben/80/orig -> origin/gh/coconutruben/80/orig 2025-12-04T11:11:09.6161358Z * [new branch] gh/coconutruben/82/base -> origin/gh/coconutruben/82/base 2025-12-04T11:11:09.6161438Z * [new branch] gh/coconutruben/82/head -> origin/gh/coconutruben/82/head 2025-12-04T11:11:09.6161524Z * [new branch] gh/coconutruben/82/orig -> origin/gh/coconutruben/82/orig 2025-12-04T11:11:09.6161602Z * [new branch] gh/coconutruben/83/base -> origin/gh/coconutruben/83/base 2025-12-04T11:11:09.6161690Z * [new branch] gh/coconutruben/83/head -> origin/gh/coconutruben/83/head 2025-12-04T11:11:09.6161769Z * [new branch] gh/coconutruben/83/orig -> origin/gh/coconutruben/83/orig 2025-12-04T11:11:09.6161847Z * [new branch] gh/coconutruben/84/base -> origin/gh/coconutruben/84/base 2025-12-04T11:11:09.6161932Z * [new branch] gh/coconutruben/84/head -> origin/gh/coconutruben/84/head 2025-12-04T11:11:09.6162012Z * [new branch] gh/coconutruben/84/orig -> origin/gh/coconutruben/84/orig 2025-12-04T11:11:09.6162091Z * [new branch] gh/coconutruben/85/base -> origin/gh/coconutruben/85/base 2025-12-04T11:11:09.6162174Z * [new branch] gh/coconutruben/85/head -> origin/gh/coconutruben/85/head 2025-12-04T11:11:09.6162255Z * [new branch] gh/coconutruben/85/orig -> origin/gh/coconutruben/85/orig 2025-12-04T11:11:09.6162338Z * [new branch] gh/coconutruben/86/base -> origin/gh/coconutruben/86/base 2025-12-04T11:11:09.6162424Z * [new branch] 
gh/coconutruben/86/head -> origin/gh/coconutruben/86/head 2025-12-04T11:11:09.6162503Z * [new branch] gh/coconutruben/86/orig -> origin/gh/coconutruben/86/orig 2025-12-04T11:11:09.6162587Z * [new branch] gh/colinchan15/1/base -> origin/gh/colinchan15/1/base 2025-12-04T11:11:09.6162672Z * [new branch] gh/colinchan15/1/head -> origin/gh/colinchan15/1/head 2025-12-04T11:11:09.6162750Z * [new branch] gh/colinchan15/2/base -> origin/gh/colinchan15/2/base 2025-12-04T11:11:09.6162827Z * [new branch] gh/colinchan15/2/head -> origin/gh/colinchan15/2/head 2025-12-04T11:11:09.6162910Z * [new branch] gh/colinchan15/3/base -> origin/gh/colinchan15/3/base 2025-12-04T11:11:09.6162986Z * [new branch] gh/colinchan15/3/head -> origin/gh/colinchan15/3/head 2025-12-04T11:11:09.6163065Z * [new branch] gh/colinchan15/6/base -> origin/gh/colinchan15/6/base 2025-12-04T11:11:09.6163147Z * [new branch] gh/colinchan15/6/head -> origin/gh/colinchan15/6/head 2025-12-04T11:11:09.6163218Z * [new branch] gh/d4l3k/1/base -> origin/gh/d4l3k/1/base 2025-12-04T11:11:09.6163294Z * [new branch] gh/d4l3k/1/head -> origin/gh/d4l3k/1/head 2025-12-04T11:11:09.6163362Z * [new branch] gh/d4l3k/2/base -> origin/gh/d4l3k/2/base 2025-12-04T11:11:09.6163429Z * [new branch] gh/d4l3k/2/head -> origin/gh/d4l3k/2/head 2025-12-04T11:11:09.6163501Z * [new branch] gh/d4l3k/2/orig -> origin/gh/d4l3k/2/orig 2025-12-04T11:11:09.6163592Z * [new branch] gh/d4l3k/3/base -> origin/gh/d4l3k/3/base 2025-12-04T11:11:09.6163657Z * [new branch] gh/d4l3k/3/head -> origin/gh/d4l3k/3/head 2025-12-04T11:11:09.6163752Z * [new branch] gh/d4l3k/3/orig -> origin/gh/d4l3k/3/orig 2025-12-04T11:11:09.6163820Z * [new branch] gh/d4l3k/4/base -> origin/gh/d4l3k/4/base 2025-12-04T11:11:09.6163901Z * [new branch] gh/d4l3k/4/head -> origin/gh/d4l3k/4/head 2025-12-04T11:11:09.6163974Z * [new branch] gh/d4l3k/4/orig -> origin/gh/d4l3k/4/orig 2025-12-04T11:11:09.6164042Z * [new branch] gh/d4l3k/5/base -> origin/gh/d4l3k/5/base 2025-12-04T11:11:09.6164109Z * [new branch] gh/d4l3k/5/orig -> origin/gh/d4l3k/5/orig 2025-12-04T11:11:09.6164210Z * [new branch] gh/davidberard98/392/base -> origin/gh/davidberard98/392/base 2025-12-04T11:11:09.6164302Z * [new branch] gh/davidberard98/392/head -> origin/gh/davidberard98/392/head 2025-12-04T11:11:09.6164389Z * [new branch] gh/davidberard98/392/orig -> origin/gh/davidberard98/392/orig 2025-12-04T11:11:09.6164482Z * [new branch] gh/davidberard98/399/base -> origin/gh/davidberard98/399/base 2025-12-04T11:11:09.6164568Z * [new branch] gh/davidberard98/399/head -> origin/gh/davidberard98/399/head 2025-12-04T11:11:09.6164653Z * [new branch] gh/davidberard98/399/orig -> origin/gh/davidberard98/399/orig 2025-12-04T11:11:09.6164738Z * [new branch] gh/desertfire/605/base -> origin/gh/desertfire/605/base 2025-12-04T11:11:09.6164814Z * [new branch] gh/desertfire/605/head -> origin/gh/desertfire/605/head 2025-12-04T11:11:09.6164888Z * [new branch] gh/desertfire/605/orig -> origin/gh/desertfire/605/orig 2025-12-04T11:11:09.6164965Z * [new branch] gh/desertfire/606/base -> origin/gh/desertfire/606/base 2025-12-04T11:11:09.6165044Z * [new branch] gh/desertfire/606/head -> origin/gh/desertfire/606/head 2025-12-04T11:11:09.6165127Z * [new branch] gh/desertfire/606/orig -> origin/gh/desertfire/606/orig 2025-12-04T11:11:09.6165205Z * [new branch] gh/desertfire/607/base -> origin/gh/desertfire/607/base 2025-12-04T11:11:09.6165282Z * [new branch] gh/desertfire/607/head -> origin/gh/desertfire/607/head 2025-12-04T11:11:09.6165362Z * [new branch] 
gh/desertfire/607/orig -> origin/gh/desertfire/607/orig 2025-12-04T11:11:09.6165439Z * [new branch] gh/desertfire/608/base -> origin/gh/desertfire/608/base 2025-12-04T11:11:09.6165516Z * [new branch] gh/desertfire/608/head -> origin/gh/desertfire/608/head 2025-12-04T11:11:09.6165598Z * [new branch] gh/desertfire/608/orig -> origin/gh/desertfire/608/orig 2025-12-04T11:11:09.6165676Z * [new branch] gh/desertfire/609/base -> origin/gh/desertfire/609/base 2025-12-04T11:11:09.6165752Z * [new branch] gh/desertfire/609/head -> origin/gh/desertfire/609/head 2025-12-04T11:11:09.6165835Z * [new branch] gh/desertfire/609/orig -> origin/gh/desertfire/609/orig 2025-12-04T11:11:09.6165913Z * [new branch] gh/desertfire/610/base -> origin/gh/desertfire/610/base 2025-12-04T11:11:09.6165990Z * [new branch] gh/desertfire/610/head -> origin/gh/desertfire/610/head 2025-12-04T11:11:09.6166069Z * [new branch] gh/desertfire/610/orig -> origin/gh/desertfire/610/orig 2025-12-04T11:11:09.6166147Z * [new branch] gh/desertfire/611/base -> origin/gh/desertfire/611/base 2025-12-04T11:11:09.6166224Z * [new branch] gh/desertfire/611/head -> origin/gh/desertfire/611/head 2025-12-04T11:11:09.6166306Z * [new branch] gh/desertfire/611/orig -> origin/gh/desertfire/611/orig 2025-12-04T11:11:09.6166402Z * [new branch] gh/desertfire/612/base -> origin/gh/desertfire/612/base 2025-12-04T11:11:09.6166481Z * [new branch] gh/desertfire/612/head -> origin/gh/desertfire/612/head 2025-12-04T11:11:09.6166564Z * [new branch] gh/desertfire/612/orig -> origin/gh/desertfire/612/orig 2025-12-04T11:11:09.6166668Z * [new branch] gh/desertfire/613/base -> origin/gh/desertfire/613/base 2025-12-04T11:11:09.6166745Z * [new branch] gh/desertfire/613/head -> origin/gh/desertfire/613/head 2025-12-04T11:11:09.6166827Z * [new branch] gh/desertfire/613/orig -> origin/gh/desertfire/613/orig 2025-12-04T11:11:09.6166904Z * [new branch] gh/desertfire/614/base -> origin/gh/desertfire/614/base 2025-12-04T11:11:09.6166986Z * [new branch] gh/desertfire/614/head -> origin/gh/desertfire/614/head 2025-12-04T11:11:09.6167063Z * [new branch] gh/desertfire/614/orig -> origin/gh/desertfire/614/orig 2025-12-04T11:11:09.6167141Z * [new branch] gh/desertfire/615/base -> origin/gh/desertfire/615/base 2025-12-04T11:11:09.6167224Z * [new branch] gh/desertfire/615/head -> origin/gh/desertfire/615/head 2025-12-04T11:11:09.6167303Z * [new branch] gh/desertfire/615/orig -> origin/gh/desertfire/615/orig 2025-12-04T11:11:09.6167378Z * [new branch] gh/desertfire/616/base -> origin/gh/desertfire/616/base 2025-12-04T11:11:09.6167459Z * [new branch] gh/desertfire/616/head -> origin/gh/desertfire/616/head 2025-12-04T11:11:09.6167534Z * [new branch] gh/desertfire/616/orig -> origin/gh/desertfire/616/orig 2025-12-04T11:11:09.6167610Z * [new branch] gh/desertfire/617/base -> origin/gh/desertfire/617/base 2025-12-04T11:11:09.6167685Z * [new branch] gh/desertfire/617/head -> origin/gh/desertfire/617/head 2025-12-04T11:11:09.6167763Z * [new branch] gh/desertfire/617/orig -> origin/gh/desertfire/617/orig 2025-12-04T11:11:09.6167838Z * [new branch] gh/dharakk/1/base -> origin/gh/dharakk/1/base 2025-12-04T11:11:09.6167911Z * [new branch] gh/dharakk/1/head -> origin/gh/dharakk/1/head 2025-12-04T11:11:09.6167993Z * [new branch] gh/drisspg/170/base -> origin/gh/drisspg/170/base 2025-12-04T11:11:09.6168067Z * [new branch] gh/drisspg/170/head -> origin/gh/drisspg/170/head 2025-12-04T11:11:09.6168139Z * [new branch] gh/drisspg/170/orig -> origin/gh/drisspg/170/orig 2025-12-04T11:11:09.6168270Z * [new 
branch] gh/drisspg/182/base -> origin/gh/drisspg/182/base 2025-12-04T11:11:09.6168342Z * [new branch] gh/drisspg/182/head -> origin/gh/drisspg/182/head 2025-12-04T11:11:09.6168417Z * [new branch] gh/drisspg/183/base -> origin/gh/drisspg/183/base 2025-12-04T11:11:09.6168488Z * [new branch] gh/drisspg/183/head -> origin/gh/drisspg/183/head 2025-12-04T11:11:09.6168557Z * [new branch] gh/drisspg/184/base -> origin/gh/drisspg/184/base 2025-12-04T11:11:09.6168633Z * [new branch] gh/drisspg/184/head -> origin/gh/drisspg/184/head 2025-12-04T11:11:09.6168706Z * [new branch] gh/drisspg/185/base -> origin/gh/drisspg/185/base 2025-12-04T11:11:09.6168778Z * [new branch] gh/drisspg/185/head -> origin/gh/drisspg/185/head 2025-12-04T11:11:09.6168855Z * [new branch] gh/drisspg/194/base -> origin/gh/drisspg/194/base 2025-12-04T11:11:09.6168926Z * [new branch] gh/drisspg/194/head -> origin/gh/drisspg/194/head 2025-12-04T11:11:09.6168998Z * [new branch] gh/drisspg/194/orig -> origin/gh/drisspg/194/orig 2025-12-04T11:11:09.6169072Z * [new branch] gh/drisspg/200/base -> origin/gh/drisspg/200/base 2025-12-04T11:11:09.6169145Z * [new branch] gh/drisspg/200/head -> origin/gh/drisspg/200/head 2025-12-04T11:11:09.6169248Z * [new branch] gh/drisspg/200/orig -> origin/gh/drisspg/200/orig 2025-12-04T11:11:09.6169325Z * [new branch] gh/drisspg/218/base -> origin/gh/drisspg/218/base 2025-12-04T11:11:09.6169423Z * [new branch] gh/drisspg/218/head -> origin/gh/drisspg/218/head 2025-12-04T11:11:09.6169495Z * [new branch] gh/drisspg/218/orig -> origin/gh/drisspg/218/orig 2025-12-04T11:11:09.6169570Z * [new branch] gh/drisspg/219/base -> origin/gh/drisspg/219/base 2025-12-04T11:11:09.6169643Z * [new branch] gh/drisspg/219/head -> origin/gh/drisspg/219/head 2025-12-04T11:11:09.6169715Z * [new branch] gh/drisspg/219/orig -> origin/gh/drisspg/219/orig 2025-12-04T11:11:09.6169786Z * [new branch] gh/drisspg/220/base -> origin/gh/drisspg/220/base 2025-12-04T11:11:09.6169858Z * [new branch] gh/drisspg/220/head -> origin/gh/drisspg/220/head 2025-12-04T11:11:09.6169931Z * [new branch] gh/drisspg/220/orig -> origin/gh/drisspg/220/orig 2025-12-04T11:11:09.6170008Z * [new branch] gh/drisspg/221/base -> origin/gh/drisspg/221/base 2025-12-04T11:11:09.6170083Z * [new branch] gh/drisspg/221/head -> origin/gh/drisspg/221/head 2025-12-04T11:11:09.6170159Z * [new branch] gh/drisspg/221/orig -> origin/gh/drisspg/221/orig 2025-12-04T11:11:09.6170233Z * [new branch] gh/drisspg/222/base -> origin/gh/drisspg/222/base 2025-12-04T11:11:09.6170304Z * [new branch] gh/drisspg/222/head -> origin/gh/drisspg/222/head 2025-12-04T11:11:09.6170380Z * [new branch] gh/drisspg/222/orig -> origin/gh/drisspg/222/orig 2025-12-04T11:11:09.6170451Z * [new branch] gh/drisspg/223/base -> origin/gh/drisspg/223/base 2025-12-04T11:11:09.6170523Z * [new branch] gh/drisspg/223/head -> origin/gh/drisspg/223/head 2025-12-04T11:11:09.6170600Z * [new branch] gh/drisspg/223/orig -> origin/gh/drisspg/223/orig 2025-12-04T11:11:09.6170674Z * [new branch] gh/drisspg/224/base -> origin/gh/drisspg/224/base 2025-12-04T11:11:09.6170750Z * [new branch] gh/drisspg/224/head -> origin/gh/drisspg/224/head 2025-12-04T11:11:09.6170826Z * [new branch] gh/drisspg/224/orig -> origin/gh/drisspg/224/orig 2025-12-04T11:11:09.6170897Z * [new branch] gh/drisspg/225/base -> origin/gh/drisspg/225/base 2025-12-04T11:11:09.6170969Z * [new branch] gh/drisspg/225/head -> origin/gh/drisspg/225/head 2025-12-04T11:11:09.6171044Z * [new branch] gh/drisspg/225/orig -> origin/gh/drisspg/225/orig 
2025-12-04T11:11:09.6171113Z * [new branch] gh/drisspg/226/base -> origin/gh/drisspg/226/base 2025-12-04T11:11:09.6171183Z * [new branch] gh/drisspg/226/head -> origin/gh/drisspg/226/head 2025-12-04T11:11:09.6171260Z * [new branch] gh/drisspg/226/orig -> origin/gh/drisspg/226/orig 2025-12-04T11:11:09.6171331Z * [new branch] gh/drisspg/227/base -> origin/gh/drisspg/227/base 2025-12-04T11:11:09.6171406Z * [new branch] gh/drisspg/227/head -> origin/gh/drisspg/227/head 2025-12-04T11:11:09.6171482Z * [new branch] gh/drisspg/227/orig -> origin/gh/drisspg/227/orig 2025-12-04T11:11:09.6171554Z * [new branch] gh/drisspg/228/base -> origin/gh/drisspg/228/base 2025-12-04T11:11:09.6171626Z * [new branch] gh/drisspg/228/head -> origin/gh/drisspg/228/head 2025-12-04T11:11:09.6171704Z * [new branch] gh/drisspg/228/orig -> origin/gh/drisspg/228/orig 2025-12-04T11:11:09.6171774Z * [new branch] gh/drisspg/229/base -> origin/gh/drisspg/229/base 2025-12-04T11:11:09.6171849Z * [new branch] gh/drisspg/229/head -> origin/gh/drisspg/229/head 2025-12-04T11:11:09.6171953Z * [new branch] gh/drisspg/229/orig -> origin/gh/drisspg/229/orig 2025-12-04T11:11:09.6172025Z * [new branch] gh/drisspg/230/base -> origin/gh/drisspg/230/base 2025-12-04T11:11:09.6172123Z * [new branch] gh/drisspg/230/head -> origin/gh/drisspg/230/head 2025-12-04T11:11:09.6172194Z * [new branch] gh/drisspg/230/orig -> origin/gh/drisspg/230/orig 2025-12-04T11:11:09.6172271Z * [new branch] gh/dsjohns2/1/base -> origin/gh/dsjohns2/1/base 2025-12-04T11:11:09.6172350Z * [new branch] gh/dsjohns2/1/head -> origin/gh/dsjohns2/1/head 2025-12-04T11:11:09.6172432Z * [new branch] gh/dzmitry-huba/1/base -> origin/gh/dzmitry-huba/1/base 2025-12-04T11:11:09.6172510Z * [new branch] gh/dzmitry-huba/1/head -> origin/gh/dzmitry-huba/1/head 2025-12-04T11:11:09.6172599Z * [new branch] gh/dzmitry-huba/12/base -> origin/gh/dzmitry-huba/12/base 2025-12-04T11:11:09.6172678Z * [new branch] gh/dzmitry-huba/12/head -> origin/gh/dzmitry-huba/12/head 2025-12-04T11:11:09.6172756Z * [new branch] gh/dzmitry-huba/12/orig -> origin/gh/dzmitry-huba/12/orig 2025-12-04T11:11:09.6172840Z * [new branch] gh/dzmitry-huba/13/base -> origin/gh/dzmitry-huba/13/base 2025-12-04T11:11:09.6172917Z * [new branch] gh/dzmitry-huba/13/head -> origin/gh/dzmitry-huba/13/head 2025-12-04T11:11:09.6172994Z * [new branch] gh/dzmitry-huba/13/orig -> origin/gh/dzmitry-huba/13/orig 2025-12-04T11:11:09.6173076Z * [new branch] gh/dzmitry-huba/14/base -> origin/gh/dzmitry-huba/14/base 2025-12-04T11:11:09.6173153Z * [new branch] gh/dzmitry-huba/14/head -> origin/gh/dzmitry-huba/14/head 2025-12-04T11:11:09.6173231Z * [new branch] gh/dzmitry-huba/14/orig -> origin/gh/dzmitry-huba/14/orig 2025-12-04T11:11:09.6173314Z * [new branch] gh/dzmitry-huba/15/base -> origin/gh/dzmitry-huba/15/base 2025-12-04T11:11:09.6173391Z * [new branch] gh/dzmitry-huba/15/head -> origin/gh/dzmitry-huba/15/head 2025-12-04T11:11:09.6173469Z * [new branch] gh/dzmitry-huba/15/orig -> origin/gh/dzmitry-huba/15/orig 2025-12-04T11:11:09.6173550Z * [new branch] gh/dzmitry-huba/16/base -> origin/gh/dzmitry-huba/16/base 2025-12-04T11:11:09.6173627Z * [new branch] gh/dzmitry-huba/16/head -> origin/gh/dzmitry-huba/16/head 2025-12-04T11:11:09.6173709Z * [new branch] gh/dzmitry-huba/16/orig -> origin/gh/dzmitry-huba/16/orig 2025-12-04T11:11:09.6173787Z * [new branch] gh/dzmitry-huba/17/base -> origin/gh/dzmitry-huba/17/base 2025-12-04T11:11:09.6173864Z * [new branch] gh/dzmitry-huba/17/head -> origin/gh/dzmitry-huba/17/head 
2025-12-04T11:11:09.6173946Z * [new branch] gh/dzmitry-huba/17/orig -> origin/gh/dzmitry-huba/17/orig 2025-12-04T11:11:09.6174027Z * [new branch] gh/dzmitry-huba/2/base -> origin/gh/dzmitry-huba/2/base 2025-12-04T11:11:09.6174103Z * [new branch] gh/dzmitry-huba/2/head -> origin/gh/dzmitry-huba/2/head 2025-12-04T11:11:09.6174186Z * [new branch] gh/dzmitry-huba/3/base -> origin/gh/dzmitry-huba/3/base 2025-12-04T11:11:09.6174262Z * [new branch] gh/dzmitry-huba/3/head -> origin/gh/dzmitry-huba/3/head 2025-12-04T11:11:09.6174342Z * [new branch] gh/eellison/808/base -> origin/gh/eellison/808/base 2025-12-04T11:11:09.6174422Z * [new branch] gh/eellison/808/head -> origin/gh/eellison/808/head 2025-12-04T11:11:09.6174497Z * [new branch] gh/eellison/808/orig -> origin/gh/eellison/808/orig 2025-12-04T11:11:09.6174572Z * [new branch] gh/eellison/822/base -> origin/gh/eellison/822/base 2025-12-04T11:11:09.6174649Z * [new branch] gh/eellison/822/head -> origin/gh/eellison/822/head 2025-12-04T11:11:09.6174745Z * [new branch] gh/eellison/822/orig -> origin/gh/eellison/822/orig 2025-12-04T11:11:09.6174818Z * [new branch] gh/eellison/823/base -> origin/gh/eellison/823/base 2025-12-04T11:11:09.6175135Z * [new branch] gh/eellison/823/head -> origin/gh/eellison/823/head 2025-12-04T11:11:09.6175207Z * [new branch] gh/eellison/823/orig -> origin/gh/eellison/823/orig 2025-12-04T11:11:09.6175280Z * [new branch] gh/eellison/862/base -> origin/gh/eellison/862/base 2025-12-04T11:11:09.6175356Z * [new branch] gh/eellison/862/head -> origin/gh/eellison/862/head 2025-12-04T11:11:09.6175428Z * [new branch] gh/eellison/862/orig -> origin/gh/eellison/862/orig 2025-12-04T11:11:09.6175504Z * [new branch] gh/eellison/863/base -> origin/gh/eellison/863/base 2025-12-04T11:11:09.6175576Z * [new branch] gh/eellison/863/head -> origin/gh/eellison/863/head 2025-12-04T11:11:09.6175649Z * [new branch] gh/eellison/863/orig -> origin/gh/eellison/863/orig 2025-12-04T11:11:09.6175725Z * [new branch] gh/eellison/864/base -> origin/gh/eellison/864/base 2025-12-04T11:11:09.6175800Z * [new branch] gh/eellison/864/head -> origin/gh/eellison/864/head 2025-12-04T11:11:09.6175872Z * [new branch] gh/eellison/864/orig -> origin/gh/eellison/864/orig 2025-12-04T11:11:09.6175948Z * [new branch] gh/eellison/865/base -> origin/gh/eellison/865/base 2025-12-04T11:11:09.6176021Z * [new branch] gh/eellison/865/head -> origin/gh/eellison/865/head 2025-12-04T11:11:09.6176093Z * [new branch] gh/eellison/865/orig -> origin/gh/eellison/865/orig 2025-12-04T11:11:09.6176170Z * [new branch] gh/eellison/866/base -> origin/gh/eellison/866/base 2025-12-04T11:11:09.6176244Z * [new branch] gh/eellison/866/head -> origin/gh/eellison/866/head 2025-12-04T11:11:09.6176317Z * [new branch] gh/eellison/866/orig -> origin/gh/eellison/866/orig 2025-12-04T11:11:09.6176394Z * [new branch] gh/eellison/867/base -> origin/gh/eellison/867/base 2025-12-04T11:11:09.6176469Z * [new branch] gh/eellison/867/head -> origin/gh/eellison/867/head 2025-12-04T11:11:09.6176538Z * [new branch] gh/eellison/867/orig -> origin/gh/eellison/867/orig 2025-12-04T11:11:09.6176613Z * [new branch] gh/eellison/868/base -> origin/gh/eellison/868/base 2025-12-04T11:11:09.6176686Z * [new branch] gh/eellison/868/head -> origin/gh/eellison/868/head 2025-12-04T11:11:09.6176759Z * [new branch] gh/eellison/868/orig -> origin/gh/eellison/868/orig 2025-12-04T11:11:09.6176836Z * [new branch] gh/eellison/869/base -> origin/gh/eellison/869/base 2025-12-04T11:11:09.6176910Z * [new branch] gh/eellison/869/head -> 
origin/gh/eellison/869/head 2025-12-04T11:11:09.6176987Z * [new branch] gh/eellison/869/orig -> origin/gh/eellison/869/orig 2025-12-04T11:11:09.6177059Z * [new branch] gh/eellison/870/base -> origin/gh/eellison/870/base 2025-12-04T11:11:09.6177135Z * [new branch] gh/eellison/870/head -> origin/gh/eellison/870/head 2025-12-04T11:11:09.6177212Z * [new branch] gh/eellison/870/orig -> origin/gh/eellison/870/orig 2025-12-04T11:11:09.6177285Z * [new branch] gh/eellison/871/base -> origin/gh/eellison/871/base 2025-12-04T11:11:09.6177359Z * [new branch] gh/eellison/871/head -> origin/gh/eellison/871/head 2025-12-04T11:11:09.6177436Z * [new branch] gh/eellison/871/orig -> origin/gh/eellison/871/orig 2025-12-04T11:11:09.6177509Z * [new branch] gh/eellison/872/base -> origin/gh/eellison/872/base 2025-12-04T11:11:09.6177605Z * [new branch] gh/eellison/872/head -> origin/gh/eellison/872/head 2025-12-04T11:11:09.6177682Z * [new branch] gh/eellison/872/orig -> origin/gh/eellison/872/orig 2025-12-04T11:11:09.6177756Z * [new branch] gh/eellison/873/base -> origin/gh/eellison/873/base 2025-12-04T11:11:09.6177849Z * [new branch] gh/eellison/873/head -> origin/gh/eellison/873/head 2025-12-04T11:11:09.6177926Z * [new branch] gh/eellison/873/orig -> origin/gh/eellison/873/orig 2025-12-04T11:11:09.6177996Z * [new branch] gh/eellison/874/base -> origin/gh/eellison/874/base 2025-12-04T11:11:09.6178068Z * [new branch] gh/eellison/874/head -> origin/gh/eellison/874/head 2025-12-04T11:11:09.6178181Z * [new branch] gh/eellison/874/orig -> origin/gh/eellison/874/orig 2025-12-04T11:11:09.6178255Z * [new branch] gh/eellison/875/base -> origin/gh/eellison/875/base 2025-12-04T11:11:09.6178328Z * [new branch] gh/eellison/875/head -> origin/gh/eellison/875/head 2025-12-04T11:11:09.6178406Z * [new branch] gh/eellison/875/orig -> origin/gh/eellison/875/orig 2025-12-04T11:11:09.6178479Z * [new branch] gh/eellison/876/base -> origin/gh/eellison/876/base 2025-12-04T11:11:09.6178557Z * [new branch] gh/eellison/876/head -> origin/gh/eellison/876/head 2025-12-04T11:11:09.6178636Z * [new branch] gh/eellison/876/orig -> origin/gh/eellison/876/orig 2025-12-04T11:11:09.6178709Z * [new branch] gh/eellison/877/base -> origin/gh/eellison/877/base 2025-12-04T11:11:09.6178788Z * [new branch] gh/eellison/877/head -> origin/gh/eellison/877/head 2025-12-04T11:11:09.6178863Z * [new branch] gh/eellison/877/orig -> origin/gh/eellison/877/orig 2025-12-04T11:11:09.6178937Z * [new branch] gh/eellison/878/base -> origin/gh/eellison/878/base 2025-12-04T11:11:09.6179017Z * [new branch] gh/eellison/878/head -> origin/gh/eellison/878/head 2025-12-04T11:11:09.6179090Z * [new branch] gh/eellison/878/orig -> origin/gh/eellison/878/orig 2025-12-04T11:11:09.6179163Z * [new branch] gh/eellison/879/base -> origin/gh/eellison/879/base 2025-12-04T11:11:09.6179244Z * [new branch] gh/eellison/879/head -> origin/gh/eellison/879/head 2025-12-04T11:11:09.6179316Z * [new branch] gh/eellison/879/orig -> origin/gh/eellison/879/orig 2025-12-04T11:11:09.6179386Z * [new branch] gh/eellison/880/base -> origin/gh/eellison/880/base 2025-12-04T11:11:09.6179463Z * [new branch] gh/eellison/880/head -> origin/gh/eellison/880/head 2025-12-04T11:11:09.6179536Z * [new branch] gh/eellison/880/orig -> origin/gh/eellison/880/orig 2025-12-04T11:11:09.6179610Z * [new branch] gh/eellison/881/base -> origin/gh/eellison/881/base 2025-12-04T11:11:09.6179688Z * [new branch] gh/eellison/881/head -> origin/gh/eellison/881/head 2025-12-04T11:11:09.6179761Z * [new branch] gh/eellison/881/orig -> 
origin/gh/eellison/881/orig 2025-12-04T11:11:09.6179835Z * [new branch] gh/eellison/882/base -> origin/gh/eellison/882/base 2025-12-04T11:11:09.6179913Z * [new branch] gh/eellison/882/head -> origin/gh/eellison/882/head 2025-12-04T11:11:09.6179986Z * [new branch] gh/eellison/882/orig -> origin/gh/eellison/882/orig 2025-12-04T11:11:09.6180059Z * [new branch] gh/eellison/883/base -> origin/gh/eellison/883/base 2025-12-04T11:11:09.6180136Z * [new branch] gh/eellison/883/head -> origin/gh/eellison/883/head 2025-12-04T11:11:09.6180209Z * [new branch] gh/eellison/883/orig -> origin/gh/eellison/883/orig 2025-12-04T11:11:09.6180286Z * [new branch] gh/eellison/884/base -> origin/gh/eellison/884/base 2025-12-04T11:11:09.6180388Z * [new branch] gh/eellison/884/head -> origin/gh/eellison/884/head 2025-12-04T11:11:09.6180462Z * [new branch] gh/eellison/884/orig -> origin/gh/eellison/884/orig 2025-12-04T11:11:09.6180558Z * [new branch] gh/etaf/147/base -> origin/gh/etaf/147/base 2025-12-04T11:11:09.6180626Z * [new branch] gh/etaf/147/head -> origin/gh/etaf/147/head 2025-12-04T11:11:09.6180694Z * [new branch] gh/etaf/154/base -> origin/gh/etaf/154/base 2025-12-04T11:11:09.6180767Z * [new branch] gh/etaf/154/head -> origin/gh/etaf/154/head 2025-12-04T11:11:09.6180835Z * [new branch] gh/etaf/154/orig -> origin/gh/etaf/154/orig 2025-12-04T11:11:09.6180901Z * [new branch] gh/etaf/156/base -> origin/gh/etaf/156/base 2025-12-04T11:11:09.6180972Z * [new branch] gh/etaf/156/head -> origin/gh/etaf/156/head 2025-12-04T11:11:09.6181042Z * [new branch] gh/etaf/156/orig -> origin/gh/etaf/156/orig 2025-12-04T11:11:09.6181108Z * [new branch] gh/etaf/157/base -> origin/gh/etaf/157/base 2025-12-04T11:11:09.6181181Z * [new branch] gh/etaf/157/head -> origin/gh/etaf/157/head 2025-12-04T11:11:09.6181249Z * [new branch] gh/etaf/157/orig -> origin/gh/etaf/157/orig 2025-12-04T11:11:09.6181317Z * [new branch] gh/etaf/158/base -> origin/gh/etaf/158/base 2025-12-04T11:11:09.6181383Z * [new branch] gh/etaf/158/head -> origin/gh/etaf/158/head 2025-12-04T11:11:09.6181448Z * [new branch] gh/etaf/158/orig -> origin/gh/etaf/158/orig 2025-12-04T11:11:09.6181515Z * [new branch] gh/etaf/159/base -> origin/gh/etaf/159/base 2025-12-04T11:11:09.6181582Z * [new branch] gh/etaf/159/head -> origin/gh/etaf/159/head 2025-12-04T11:11:09.6181650Z * [new branch] gh/etaf/159/orig -> origin/gh/etaf/159/orig 2025-12-04T11:11:09.6181720Z * [new branch] gh/etaf/160/base -> origin/gh/etaf/160/base 2025-12-04T11:11:09.6181784Z * [new branch] gh/etaf/160/head -> origin/gh/etaf/160/head 2025-12-04T11:11:09.6181850Z * [new branch] gh/etaf/160/orig -> origin/gh/etaf/160/orig 2025-12-04T11:11:09.6181917Z * [new branch] gh/etaf/161/base -> origin/gh/etaf/161/base 2025-12-04T11:11:09.6181983Z * [new branch] gh/etaf/161/head -> origin/gh/etaf/161/head 2025-12-04T11:11:09.6182050Z * [new branch] gh/etaf/161/orig -> origin/gh/etaf/161/orig 2025-12-04T11:11:09.6182119Z * [new branch] gh/etaf/166/base -> origin/gh/etaf/166/base 2025-12-04T11:11:09.6182185Z * [new branch] gh/etaf/166/head -> origin/gh/etaf/166/head 2025-12-04T11:11:09.6182252Z * [new branch] gh/etaf/166/orig -> origin/gh/etaf/166/orig 2025-12-04T11:11:09.6182319Z * [new branch] gh/etaf/167/base -> origin/gh/etaf/167/base 2025-12-04T11:11:09.6182384Z * [new branch] gh/etaf/167/head -> origin/gh/etaf/167/head 2025-12-04T11:11:09.6182452Z * [new branch] gh/etaf/167/orig -> origin/gh/etaf/167/orig 2025-12-04T11:11:09.6182517Z * [new branch] gh/etaf/168/base -> origin/gh/etaf/168/base 
2025-12-04T11:11:09.6182583Z * [new branch] gh/etaf/168/head -> origin/gh/etaf/168/head 2025-12-04T11:11:09.6182652Z * [new branch] gh/etaf/168/orig -> origin/gh/etaf/168/orig 2025-12-04T11:11:09.6182717Z * [new branch] gh/etaf/172/base -> origin/gh/etaf/172/base 2025-12-04T11:11:09.6182781Z * [new branch] gh/etaf/172/head -> origin/gh/etaf/172/head 2025-12-04T11:11:09.6182849Z * [new branch] gh/etaf/172/orig -> origin/gh/etaf/172/orig 2025-12-04T11:11:09.6182943Z * [new branch] gh/etaf/173/base -> origin/gh/etaf/173/base 2025-12-04T11:11:09.6183010Z * [new branch] gh/etaf/173/head -> origin/gh/etaf/173/head 2025-12-04T11:11:09.6183097Z * [new branch] gh/etaf/173/orig -> origin/gh/etaf/173/orig 2025-12-04T11:11:09.6183162Z * [new branch] gh/etaf/174/base -> origin/gh/etaf/174/base 2025-12-04T11:11:09.6183227Z * [new branch] gh/etaf/174/head -> origin/gh/etaf/174/head 2025-12-04T11:11:09.6183295Z * [new branch] gh/etaf/175/base -> origin/gh/etaf/175/base 2025-12-04T11:11:09.6183361Z * [new branch] gh/etaf/175/head -> origin/gh/etaf/175/head 2025-12-04T11:11:09.6183425Z * [new branch] gh/etaf/175/orig -> origin/gh/etaf/175/orig 2025-12-04T11:11:09.6183491Z * [new branch] gh/etaf/176/base -> origin/gh/etaf/176/base 2025-12-04T11:11:09.6183557Z * [new branch] gh/etaf/176/head -> origin/gh/etaf/176/head 2025-12-04T11:11:09.6183626Z * [new branch] gh/etaf/176/orig -> origin/gh/etaf/176/orig 2025-12-04T11:11:09.6183689Z * [new branch] gh/etaf/177/base -> origin/gh/etaf/177/base 2025-12-04T11:11:09.6183754Z * [new branch] gh/etaf/177/head -> origin/gh/etaf/177/head 2025-12-04T11:11:09.6183820Z * [new branch] gh/etaf/177/orig -> origin/gh/etaf/177/orig 2025-12-04T11:11:09.6183883Z * [new branch] gh/etaf/178/base -> origin/gh/etaf/178/base 2025-12-04T11:11:09.6183948Z * [new branch] gh/etaf/178/head -> origin/gh/etaf/178/head 2025-12-04T11:11:09.6184013Z * [new branch] gh/etaf/178/orig -> origin/gh/etaf/178/orig 2025-12-04T11:11:09.6184077Z * [new branch] gh/etaf/179/base -> origin/gh/etaf/179/base 2025-12-04T11:11:09.6184143Z * [new branch] gh/etaf/179/head -> origin/gh/etaf/179/head 2025-12-04T11:11:09.6184212Z * [new branch] gh/etaf/179/orig -> origin/gh/etaf/179/orig 2025-12-04T11:11:09.6184276Z * [new branch] gh/etaf/180/base -> origin/gh/etaf/180/base 2025-12-04T11:11:09.6184344Z * [new branch] gh/etaf/180/head -> origin/gh/etaf/180/head 2025-12-04T11:11:09.6184411Z * [new branch] gh/etaf/180/orig -> origin/gh/etaf/180/orig 2025-12-04T11:11:09.6184491Z * [new branch] gh/exclamaforte/1/base -> origin/gh/exclamaforte/1/base 2025-12-04T11:11:09.6184568Z * [new branch] gh/exclamaforte/1/head -> origin/gh/exclamaforte/1/head 2025-12-04T11:11:09.6184646Z * [new branch] gh/exclamaforte/2/base -> origin/gh/exclamaforte/2/base 2025-12-04T11:11:09.6184722Z * [new branch] gh/exclamaforte/2/head -> origin/gh/exclamaforte/2/head 2025-12-04T11:11:09.6184800Z * [new branch] gh/exclamaforte/3/base -> origin/gh/exclamaforte/3/base 2025-12-04T11:11:09.6184881Z * [new branch] gh/exclamaforte/3/head -> origin/gh/exclamaforte/3/head 2025-12-04T11:11:09.6184955Z * [new branch] gh/exclamaforte/4/base -> origin/gh/exclamaforte/4/base 2025-12-04T11:11:09.6185033Z * [new branch] gh/exclamaforte/4/head -> origin/gh/exclamaforte/4/head 2025-12-04T11:11:09.6185106Z * [new branch] gh/ezyang/2374/base -> origin/gh/ezyang/2374/base 2025-12-04T11:11:09.6185178Z * [new branch] gh/ezyang/2374/head -> origin/gh/ezyang/2374/head 2025-12-04T11:11:09.6185252Z * [new branch] gh/ezyang/2374/orig -> origin/gh/ezyang/2374/orig 
2025-12-04T11:11:09.6185321Z * [new branch] gh/ezyang/2973/base -> origin/gh/ezyang/2973/base 2025-12-04T11:11:09.6185390Z * [new branch] gh/ezyang/2973/head -> origin/gh/ezyang/2973/head 2025-12-04T11:11:09.6185482Z * [new branch] gh/ezyang/2973/orig -> origin/gh/ezyang/2973/orig 2025-12-04T11:11:09.6185552Z * [new branch] gh/ezyang/2974/base -> origin/gh/ezyang/2974/base 2025-12-04T11:11:09.6185623Z * [new branch] gh/ezyang/2974/head -> origin/gh/ezyang/2974/head 2025-12-04T11:11:09.6185715Z * [new branch] gh/ezyang/2974/orig -> origin/gh/ezyang/2974/orig 2025-12-04T11:11:09.6185786Z * [new branch] gh/ezyang/3131/base -> origin/gh/ezyang/3131/base 2025-12-04T11:11:09.6185855Z * [new branch] gh/ezyang/3131/head -> origin/gh/ezyang/3131/head 2025-12-04T11:11:09.6185926Z * [new branch] gh/ezyang/3131/orig -> origin/gh/ezyang/3131/orig 2025-12-04T11:11:09.6185995Z * [new branch] gh/ezyang/3139/base -> origin/gh/ezyang/3139/base 2025-12-04T11:11:09.6186063Z * [new branch] gh/ezyang/3139/head -> origin/gh/ezyang/3139/head 2025-12-04T11:11:09.6186134Z * [new branch] gh/ezyang/3139/orig -> origin/gh/ezyang/3139/orig 2025-12-04T11:11:09.6186204Z * [new branch] gh/ezyang/3140/base -> origin/gh/ezyang/3140/base 2025-12-04T11:11:09.6186272Z * [new branch] gh/ezyang/3140/head -> origin/gh/ezyang/3140/head 2025-12-04T11:11:09.6186345Z * [new branch] gh/ezyang/3140/orig -> origin/gh/ezyang/3140/orig 2025-12-04T11:11:09.6186413Z * [new branch] gh/ezyang/3143/base -> origin/gh/ezyang/3143/base 2025-12-04T11:11:09.6186482Z * [new branch] gh/ezyang/3143/head -> origin/gh/ezyang/3143/head 2025-12-04T11:11:09.6186552Z * [new branch] gh/ezyang/3143/orig -> origin/gh/ezyang/3143/orig 2025-12-04T11:11:09.6186621Z * [new branch] gh/ezyang/3144/base -> origin/gh/ezyang/3144/base 2025-12-04T11:11:09.6186689Z * [new branch] gh/ezyang/3144/head -> origin/gh/ezyang/3144/head 2025-12-04T11:11:09.6186761Z * [new branch] gh/ezyang/3144/orig -> origin/gh/ezyang/3144/orig 2025-12-04T11:11:09.6186830Z * [new branch] gh/ezyang/3167/base -> origin/gh/ezyang/3167/base 2025-12-04T11:11:09.6186899Z * [new branch] gh/ezyang/3167/head -> origin/gh/ezyang/3167/head 2025-12-04T11:11:09.6186974Z * [new branch] gh/ezyang/3167/orig -> origin/gh/ezyang/3167/orig 2025-12-04T11:11:09.6187043Z * [new branch] gh/ezyang/3173/base -> origin/gh/ezyang/3173/base 2025-12-04T11:11:09.6187114Z * [new branch] gh/ezyang/3173/head -> origin/gh/ezyang/3173/head 2025-12-04T11:11:09.6187182Z * [new branch] gh/ezyang/3173/orig -> origin/gh/ezyang/3173/orig 2025-12-04T11:11:09.6187253Z * [new branch] gh/ezyang/3175/base -> origin/gh/ezyang/3175/base 2025-12-04T11:11:09.6187324Z * [new branch] gh/ezyang/3175/head -> origin/gh/ezyang/3175/head 2025-12-04T11:11:09.6187393Z * [new branch] gh/ezyang/3175/orig -> origin/gh/ezyang/3175/orig 2025-12-04T11:11:09.6187461Z * [new branch] gh/ezyang/3182/base -> origin/gh/ezyang/3182/base 2025-12-04T11:11:09.6187530Z * [new branch] gh/ezyang/3182/head -> origin/gh/ezyang/3182/head 2025-12-04T11:11:09.6187600Z * [new branch] gh/ezyang/3182/orig -> origin/gh/ezyang/3182/orig 2025-12-04T11:11:09.6187670Z * [new branch] gh/ezyang/3185/base -> origin/gh/ezyang/3185/base 2025-12-04T11:11:09.6187741Z * [new branch] gh/ezyang/3185/head -> origin/gh/ezyang/3185/head 2025-12-04T11:11:09.6187809Z * [new branch] gh/ezyang/3185/orig -> origin/gh/ezyang/3185/orig 2025-12-04T11:11:09.6187877Z * [new branch] gh/ezyang/3189/base -> origin/gh/ezyang/3189/base 2025-12-04T11:11:09.6187948Z * [new branch] gh/ezyang/3189/head -> 
origin/gh/ezyang/3189/head 2025-12-04T11:11:09.6188034Z * [new branch] gh/ezyang/3189/orig -> origin/gh/ezyang/3189/orig 2025-12-04T11:11:09.6188104Z * [new branch] gh/ezyang/3191/base -> origin/gh/ezyang/3191/base 2025-12-04T11:11:09.6188226Z * [new branch] gh/ezyang/3191/head -> origin/gh/ezyang/3191/head 2025-12-04T11:11:09.6188294Z * [new branch] gh/ezyang/3191/orig -> origin/gh/ezyang/3191/orig 2025-12-04T11:11:09.6188364Z * [new branch] gh/ezyang/3192/base -> origin/gh/ezyang/3192/base 2025-12-04T11:11:09.6188435Z * [new branch] gh/ezyang/3192/head -> origin/gh/ezyang/3192/head 2025-12-04T11:11:09.6188504Z * [new branch] gh/ezyang/3192/orig -> origin/gh/ezyang/3192/orig 2025-12-04T11:11:09.6188573Z * [new branch] gh/ezyang/3193/base -> origin/gh/ezyang/3193/base 2025-12-04T11:11:09.6188644Z * [new branch] gh/ezyang/3193/head -> origin/gh/ezyang/3193/head 2025-12-04T11:11:09.6188714Z * [new branch] gh/ezyang/3193/orig -> origin/gh/ezyang/3193/orig 2025-12-04T11:11:09.6188784Z * [new branch] gh/ezyang/3194/base -> origin/gh/ezyang/3194/base 2025-12-04T11:11:09.6188854Z * [new branch] gh/ezyang/3194/head -> origin/gh/ezyang/3194/head 2025-12-04T11:11:09.6188922Z * [new branch] gh/ezyang/3194/orig -> origin/gh/ezyang/3194/orig 2025-12-04T11:11:09.6188993Z * [new branch] gh/ezyang/3195/base -> origin/gh/ezyang/3195/base 2025-12-04T11:11:09.6189061Z * [new branch] gh/ezyang/3195/head -> origin/gh/ezyang/3195/head 2025-12-04T11:11:09.6189130Z * [new branch] gh/ezyang/3195/orig -> origin/gh/ezyang/3195/orig 2025-12-04T11:11:09.6189201Z * [new branch] gh/ezyang/3196/base -> origin/gh/ezyang/3196/base 2025-12-04T11:11:09.6189270Z * [new branch] gh/ezyang/3196/head -> origin/gh/ezyang/3196/head 2025-12-04T11:11:09.6189342Z * [new branch] gh/ezyang/3196/orig -> origin/gh/ezyang/3196/orig 2025-12-04T11:11:09.6189414Z * [new branch] gh/ezyang/3197/base -> origin/gh/ezyang/3197/base 2025-12-04T11:11:09.6189484Z * [new branch] gh/ezyang/3197/head -> origin/gh/ezyang/3197/head 2025-12-04T11:11:09.6189552Z * [new branch] gh/ezyang/3197/orig -> origin/gh/ezyang/3197/orig 2025-12-04T11:11:09.6189622Z * [new branch] gh/ezyang/3198/base -> origin/gh/ezyang/3198/base 2025-12-04T11:11:09.6189690Z * [new branch] gh/ezyang/3198/head -> origin/gh/ezyang/3198/head 2025-12-04T11:11:09.6189758Z * [new branch] gh/ezyang/3198/orig -> origin/gh/ezyang/3198/orig 2025-12-04T11:11:09.6189829Z * [new branch] gh/ezyang/3199/base -> origin/gh/ezyang/3199/base 2025-12-04T11:11:09.6189897Z * [new branch] gh/ezyang/3199/head -> origin/gh/ezyang/3199/head 2025-12-04T11:11:09.6189967Z * [new branch] gh/ezyang/3199/orig -> origin/gh/ezyang/3199/orig 2025-12-04T11:11:09.6190038Z * [new branch] gh/ezyang/3200/base -> origin/gh/ezyang/3200/base 2025-12-04T11:11:09.6190108Z * [new branch] gh/ezyang/3200/head -> origin/gh/ezyang/3200/head 2025-12-04T11:11:09.6190176Z * [new branch] gh/ezyang/3200/orig -> origin/gh/ezyang/3200/orig 2025-12-04T11:11:09.6190246Z * [new branch] gh/ezyang/3201/base -> origin/gh/ezyang/3201/base 2025-12-04T11:11:09.6190314Z * [new branch] gh/ezyang/3201/head -> origin/gh/ezyang/3201/head 2025-12-04T11:11:09.6190385Z * [new branch] gh/ezyang/3201/orig -> origin/gh/ezyang/3201/orig 2025-12-04T11:11:09.6190453Z * [new branch] gh/ezyang/3202/base -> origin/gh/ezyang/3202/base 2025-12-04T11:11:09.6190521Z * [new branch] gh/ezyang/3202/head -> origin/gh/ezyang/3202/head 2025-12-04T11:11:09.6190623Z * [new branch] gh/ezyang/3202/orig -> origin/gh/ezyang/3202/orig 2025-12-04T11:11:09.6190693Z * [new branch] 
gh/ezyang/3203/base -> origin/gh/ezyang/3203/base 2025-12-04T11:11:09.6190792Z * [new branch] gh/ezyang/3203/head -> origin/gh/ezyang/3203/head 2025-12-04T11:11:09.6190863Z * [new branch] gh/ezyang/3203/orig -> origin/gh/ezyang/3203/orig 2025-12-04T11:11:09.6190932Z * [new branch] gh/ezyang/3204/base -> origin/gh/ezyang/3204/base 2025-12-04T11:11:09.6191001Z * [new branch] gh/ezyang/3204/head -> origin/gh/ezyang/3204/head 2025-12-04T11:11:09.6191071Z * [new branch] gh/ezyang/3204/orig -> origin/gh/ezyang/3204/orig 2025-12-04T11:11:09.6191139Z * [new branch] gh/ezyang/3205/base -> origin/gh/ezyang/3205/base 2025-12-04T11:11:09.6191206Z * [new branch] gh/ezyang/3205/head -> origin/gh/ezyang/3205/head 2025-12-04T11:11:09.6191278Z * [new branch] gh/ezyang/3205/orig -> origin/gh/ezyang/3205/orig 2025-12-04T11:11:09.6191348Z * [new branch] gh/ezyang/3206/base -> origin/gh/ezyang/3206/base 2025-12-04T11:11:09.6191417Z * [new branch] gh/ezyang/3206/head -> origin/gh/ezyang/3206/head 2025-12-04T11:11:09.6191488Z * [new branch] gh/ezyang/3206/orig -> origin/gh/ezyang/3206/orig 2025-12-04T11:11:09.6191556Z * [new branch] gh/ezyang/3207/base -> origin/gh/ezyang/3207/base 2025-12-04T11:11:09.6191626Z * [new branch] gh/ezyang/3207/head -> origin/gh/ezyang/3207/head 2025-12-04T11:11:09.6191698Z * [new branch] gh/ezyang/3207/orig -> origin/gh/ezyang/3207/orig 2025-12-04T11:11:09.6191766Z * [new branch] gh/ezyang/3208/base -> origin/gh/ezyang/3208/base 2025-12-04T11:11:09.6191835Z * [new branch] gh/ezyang/3208/head -> origin/gh/ezyang/3208/head 2025-12-04T11:11:09.6191908Z * [new branch] gh/ezyang/3208/orig -> origin/gh/ezyang/3208/orig 2025-12-04T11:11:09.6191976Z * [new branch] gh/ezyang/3209/base -> origin/gh/ezyang/3209/base 2025-12-04T11:11:09.6192049Z * [new branch] gh/ezyang/3209/head -> origin/gh/ezyang/3209/head 2025-12-04T11:11:09.6192116Z * [new branch] gh/ezyang/3209/orig -> origin/gh/ezyang/3209/orig 2025-12-04T11:11:09.6192187Z * [new branch] gh/fadara01/3/base -> origin/gh/fadara01/3/base 2025-12-04T11:11:09.6192259Z * [new branch] gh/fadara01/3/head -> origin/gh/fadara01/3/head 2025-12-04T11:11:09.6192329Z * [new branch] gh/fadara01/3/orig -> origin/gh/fadara01/3/orig 2025-12-04T11:11:09.6192397Z * [new branch] gh/fadara01/5/base -> origin/gh/fadara01/5/base 2025-12-04T11:11:09.6192470Z * [new branch] gh/fadara01/5/head -> origin/gh/fadara01/5/head 2025-12-04T11:11:09.6192539Z * [new branch] gh/fadara01/5/orig -> origin/gh/fadara01/5/orig 2025-12-04T11:11:09.6192607Z * [new branch] gh/fadara01/6/base -> origin/gh/fadara01/6/base 2025-12-04T11:11:09.6192679Z * [new branch] gh/fadara01/6/head -> origin/gh/fadara01/6/head 2025-12-04T11:11:09.6192747Z * [new branch] gh/fadara01/6/orig -> origin/gh/fadara01/6/orig 2025-12-04T11:11:09.6192815Z * [new branch] gh/fadara01/7/base -> origin/gh/fadara01/7/base 2025-12-04T11:11:09.6192886Z * [new branch] gh/fadara01/7/head -> origin/gh/fadara01/7/head 2025-12-04T11:11:09.6192953Z * [new branch] gh/fadara01/7/orig -> origin/gh/fadara01/7/orig 2025-12-04T11:11:09.6193021Z * [new branch] gh/fadara01/8/base -> origin/gh/fadara01/8/base 2025-12-04T11:11:09.6193091Z * [new branch] gh/fadara01/8/head -> origin/gh/fadara01/8/head 2025-12-04T11:11:09.6193181Z * [new branch] gh/fadara01/8/orig -> origin/gh/fadara01/8/orig 2025-12-04T11:11:09.6193250Z * [new branch] gh/fadara01/9/base -> origin/gh/fadara01/9/base 2025-12-04T11:11:09.6193342Z * [new branch] gh/fadara01/9/head -> origin/gh/fadara01/9/head 2025-12-04T11:11:09.6193412Z * [new branch] 
gh/fadara01/9/orig -> origin/gh/fadara01/9/orig 2025-12-04T11:11:09.6193481Z * [new branch] gh/fduwjj/182/base -> origin/gh/fduwjj/182/base 2025-12-04T11:11:09.6193551Z * [new branch] gh/fduwjj/182/head -> origin/gh/fduwjj/182/head 2025-12-04T11:11:09.6193618Z * [new branch] gh/fduwjj/182/orig -> origin/gh/fduwjj/182/orig 2025-12-04T11:11:09.6193686Z * [new branch] gh/fduwjj/211/base -> origin/gh/fduwjj/211/base 2025-12-04T11:11:09.6193756Z * [new branch] gh/fduwjj/211/head -> origin/gh/fduwjj/211/head 2025-12-04T11:11:09.6193825Z * [new branch] gh/fduwjj/211/orig -> origin/gh/fduwjj/211/orig 2025-12-04T11:11:09.6193894Z * [new branch] gh/fduwjj/212/base -> origin/gh/fduwjj/212/base 2025-12-04T11:11:09.6193962Z * [new branch] gh/fduwjj/212/head -> origin/gh/fduwjj/212/head 2025-12-04T11:11:09.6194031Z * [new branch] gh/fduwjj/212/orig -> origin/gh/fduwjj/212/orig 2025-12-04T11:11:09.6194104Z * [new branch] gh/fduwjj/213/base -> origin/gh/fduwjj/213/base 2025-12-04T11:11:09.6194171Z * [new branch] gh/fduwjj/213/head -> origin/gh/fduwjj/213/head 2025-12-04T11:11:09.6194240Z * [new branch] gh/fduwjj/213/orig -> origin/gh/fduwjj/213/orig 2025-12-04T11:11:09.6194309Z * [new branch] gh/fduwjj/226/base -> origin/gh/fduwjj/226/base 2025-12-04T11:11:09.6194377Z * [new branch] gh/fduwjj/226/head -> origin/gh/fduwjj/226/head 2025-12-04T11:11:09.6194446Z * [new branch] gh/fduwjj/226/orig -> origin/gh/fduwjj/226/orig 2025-12-04T11:11:09.6194519Z * [new branch] gh/fduwjj/229/base -> origin/gh/fduwjj/229/base 2025-12-04T11:11:09.6194587Z * [new branch] gh/fduwjj/229/head -> origin/gh/fduwjj/229/head 2025-12-04T11:11:09.6194656Z * [new branch] gh/fduwjj/229/orig -> origin/gh/fduwjj/229/orig 2025-12-04T11:11:09.6194727Z * [new branch] gh/fduwjj/233/base -> origin/gh/fduwjj/233/base 2025-12-04T11:11:09.6194795Z * [new branch] gh/fduwjj/233/head -> origin/gh/fduwjj/233/head 2025-12-04T11:11:09.6194862Z * [new branch] gh/fduwjj/233/orig -> origin/gh/fduwjj/233/orig 2025-12-04T11:11:09.6194935Z * [new branch] gh/fduwjj/234/base -> origin/gh/fduwjj/234/base 2025-12-04T11:11:09.6195005Z * [new branch] gh/fduwjj/234/head -> origin/gh/fduwjj/234/head 2025-12-04T11:11:09.6195075Z * [new branch] gh/fduwjj/234/orig -> origin/gh/fduwjj/234/orig 2025-12-04T11:11:09.6195148Z * [new branch] gh/fduwjj/235/base -> origin/gh/fduwjj/235/base 2025-12-04T11:11:09.6195218Z * [new branch] gh/fduwjj/235/head -> origin/gh/fduwjj/235/head 2025-12-04T11:11:09.6195289Z * [new branch] gh/fduwjj/235/orig -> origin/gh/fduwjj/235/orig 2025-12-04T11:11:09.6195363Z * [new branch] gh/fduwjj/236/base -> origin/gh/fduwjj/236/base 2025-12-04T11:11:09.6195432Z * [new branch] gh/fduwjj/236/head -> origin/gh/fduwjj/236/head 2025-12-04T11:11:09.6195507Z * [new branch] gh/fduwjj/236/orig -> origin/gh/fduwjj/236/orig 2025-12-04T11:11:09.6195575Z * [new branch] gh/fduwjj/237/base -> origin/gh/fduwjj/237/base 2025-12-04T11:11:09.6195644Z * [new branch] gh/fduwjj/237/head -> origin/gh/fduwjj/237/head 2025-12-04T11:11:09.6195735Z * [new branch] gh/fduwjj/237/orig -> origin/gh/fduwjj/237/orig 2025-12-04T11:11:09.6195806Z * [new branch] gh/fduwjj/238/base -> origin/gh/fduwjj/238/base 2025-12-04T11:11:09.6195874Z * [new branch] gh/fduwjj/238/head -> origin/gh/fduwjj/238/head 2025-12-04T11:11:09.6195970Z * [new branch] gh/fduwjj/238/orig -> origin/gh/fduwjj/238/orig 2025-12-04T11:11:09.6196040Z * [new branch] gh/fduwjj/239/base -> origin/gh/fduwjj/239/base 2025-12-04T11:11:09.6196108Z * [new branch] gh/fduwjj/239/head -> origin/gh/fduwjj/239/head 
2025-12-04T11:11:09.6196181Z * [new branch] gh/fduwjj/239/orig -> origin/gh/fduwjj/239/orig 2025-12-04T11:11:09.6196254Z * [new branch] gh/fegin/332/base -> origin/gh/fegin/332/base 2025-12-04T11:11:09.6196323Z * [new branch] gh/fegin/332/head -> origin/gh/fegin/332/head 2025-12-04T11:11:09.6196398Z * [new branch] gh/fegin/332/orig -> origin/gh/fegin/332/orig 2025-12-04T11:11:09.6196467Z * [new branch] gh/fegin/333/base -> origin/gh/fegin/333/base 2025-12-04T11:11:09.6196533Z * [new branch] gh/fegin/333/head -> origin/gh/fegin/333/head 2025-12-04T11:11:09.6196603Z * [new branch] gh/fegin/333/orig -> origin/gh/fegin/333/orig 2025-12-04T11:11:09.6196670Z * [new branch] gh/fegin/334/base -> origin/gh/fegin/334/base 2025-12-04T11:11:09.6196738Z * [new branch] gh/fegin/334/head -> origin/gh/fegin/334/head 2025-12-04T11:11:09.6196807Z * [new branch] gh/fegin/334/orig -> origin/gh/fegin/334/orig 2025-12-04T11:11:09.6196876Z * [new branch] gh/fegin/335/base -> origin/gh/fegin/335/base 2025-12-04T11:11:09.6196943Z * [new branch] gh/fegin/335/head -> origin/gh/fegin/335/head 2025-12-04T11:11:09.6197016Z * [new branch] gh/fegin/335/orig -> origin/gh/fegin/335/orig 2025-12-04T11:11:09.6197086Z * [new branch] gh/fffrog/160/base -> origin/gh/fffrog/160/base 2025-12-04T11:11:09.6197160Z * [new branch] gh/fffrog/160/head -> origin/gh/fffrog/160/head 2025-12-04T11:11:09.6197232Z * [new branch] gh/fffrog/177/base -> origin/gh/fffrog/177/base 2025-12-04T11:11:09.6197301Z * [new branch] gh/fffrog/177/head -> origin/gh/fffrog/177/head 2025-12-04T11:11:09.6197373Z * [new branch] gh/fffrog/177/orig -> origin/gh/fffrog/177/orig 2025-12-04T11:11:09.6197442Z * [new branch] gh/fffrog/178/base -> origin/gh/fffrog/178/base 2025-12-04T11:11:09.6197511Z * [new branch] gh/fffrog/178/head -> origin/gh/fffrog/178/head 2025-12-04T11:11:09.6197580Z * [new branch] gh/fffrog/178/orig -> origin/gh/fffrog/178/orig 2025-12-04T11:11:09.6197649Z * [new branch] gh/fffrog/181/base -> origin/gh/fffrog/181/base 2025-12-04T11:11:09.6197718Z * [new branch] gh/fffrog/181/head -> origin/gh/fffrog/181/head 2025-12-04T11:11:09.6197788Z * [new branch] gh/fffrog/181/orig -> origin/gh/fffrog/181/orig 2025-12-04T11:11:09.6197859Z * [new branch] gh/fffrog/183/base -> origin/gh/fffrog/183/base 2025-12-04T11:11:09.6197926Z * [new branch] gh/fffrog/183/head -> origin/gh/fffrog/183/head 2025-12-04T11:11:09.6197999Z * [new branch] gh/fffrog/183/orig -> origin/gh/fffrog/183/orig 2025-12-04T11:11:09.6198068Z * [new branch] gh/fxdawnn/10/base -> origin/gh/fxdawnn/10/base 2025-12-04T11:11:09.6198137Z * [new branch] gh/fxdawnn/10/head -> origin/gh/fxdawnn/10/head 2025-12-04T11:11:09.6198244Z * [new branch] gh/fxdawnn/10/orig -> origin/gh/fxdawnn/10/orig 2025-12-04T11:11:09.6198313Z * [new branch] gh/fxdawnn/11/base -> origin/gh/fxdawnn/11/base 2025-12-04T11:11:09.6198419Z * [new branch] gh/fxdawnn/11/head -> origin/gh/fxdawnn/11/head 2025-12-04T11:11:09.6198493Z * [new branch] gh/fxdawnn/11/orig -> origin/gh/fxdawnn/11/orig 2025-12-04T11:11:09.6198817Z * [new branch] gh/fxdawnn/12/base -> origin/gh/fxdawnn/12/base 2025-12-04T11:11:09.6198888Z * [new branch] gh/fxdawnn/12/head -> origin/gh/fxdawnn/12/head 2025-12-04T11:11:09.6198956Z * [new branch] gh/fxdawnn/12/orig -> origin/gh/fxdawnn/12/orig 2025-12-04T11:11:09.6199025Z * [new branch] gh/fxdawnn/13/base -> origin/gh/fxdawnn/13/base 2025-12-04T11:11:09.6199097Z * [new branch] gh/fxdawnn/13/head -> origin/gh/fxdawnn/13/head 2025-12-04T11:11:09.6199166Z * [new branch] gh/fxdawnn/13/orig -> 
origin/gh/fxdawnn/13/orig 2025-12-04T11:11:09.6199233Z * [new branch] gh/fxdawnn/14/base -> origin/gh/fxdawnn/14/base 2025-12-04T11:11:09.6199310Z * [new branch] gh/fxdawnn/14/head -> origin/gh/fxdawnn/14/head 2025-12-04T11:11:09.6199378Z * [new branch] gh/fxdawnn/14/orig -> origin/gh/fxdawnn/14/orig 2025-12-04T11:11:09.6199448Z * [new branch] gh/fxdawnn/15/base -> origin/gh/fxdawnn/15/base 2025-12-04T11:11:09.6199518Z * [new branch] gh/fxdawnn/15/head -> origin/gh/fxdawnn/15/head 2025-12-04T11:11:09.6199585Z * [new branch] gh/fxdawnn/15/orig -> origin/gh/fxdawnn/15/orig 2025-12-04T11:11:09.6199655Z * [new branch] gh/fxdawnn/6/base -> origin/gh/fxdawnn/6/base 2025-12-04T11:11:09.6199726Z * [new branch] gh/fxdawnn/6/head -> origin/gh/fxdawnn/6/head 2025-12-04T11:11:09.6199793Z * [new branch] gh/fxdawnn/6/orig -> origin/gh/fxdawnn/6/orig 2025-12-04T11:11:09.6199861Z * [new branch] gh/fxdawnn/7/base -> origin/gh/fxdawnn/7/base 2025-12-04T11:11:09.6199937Z * [new branch] gh/fxdawnn/7/head -> origin/gh/fxdawnn/7/head 2025-12-04T11:11:09.6200005Z * [new branch] gh/fxdawnn/7/orig -> origin/gh/fxdawnn/7/orig 2025-12-04T11:11:09.6200074Z * [new branch] gh/fxdawnn/9/base -> origin/gh/fxdawnn/9/base 2025-12-04T11:11:09.6200143Z * [new branch] gh/fxdawnn/9/head -> origin/gh/fxdawnn/9/head 2025-12-04T11:11:09.6200210Z * [new branch] gh/fxdawnn/9/orig -> origin/gh/fxdawnn/9/orig 2025-12-04T11:11:09.6200278Z * [new branch] gh/galv/1/base -> origin/gh/galv/1/base 2025-12-04T11:11:09.6200345Z * [new branch] gh/galv/1/head -> origin/gh/galv/1/head 2025-12-04T11:11:09.6200411Z * [new branch] gh/galv/1/orig -> origin/gh/galv/1/orig 2025-12-04T11:11:09.6200476Z * [new branch] gh/galv/2/base -> origin/gh/galv/2/base 2025-12-04T11:11:09.6200547Z * [new branch] gh/galv/2/head -> origin/gh/galv/2/head 2025-12-04T11:11:09.6200611Z * [new branch] gh/galv/2/orig -> origin/gh/galv/2/orig 2025-12-04T11:11:09.6200678Z * [new branch] gh/galv/3/base -> origin/gh/galv/3/base 2025-12-04T11:11:09.6200744Z * [new branch] gh/galv/3/head -> origin/gh/galv/3/head 2025-12-04T11:11:09.6200809Z * [new branch] gh/galv/3/orig -> origin/gh/galv/3/orig 2025-12-04T11:11:09.6200892Z * [new branch] gh/guangyey/134/base -> origin/gh/guangyey/134/base 2025-12-04T11:11:09.6200969Z * [new branch] gh/guangyey/134/head -> origin/gh/guangyey/134/head 2025-12-04T11:11:09.6201043Z * [new branch] gh/guangyey/134/orig -> origin/gh/guangyey/134/orig 2025-12-04T11:11:09.6201118Z * [new branch] gh/guangyey/163/base -> origin/gh/guangyey/163/base 2025-12-04T11:11:09.6201209Z * [new branch] gh/guangyey/163/head -> origin/gh/guangyey/163/head 2025-12-04T11:11:09.6201281Z * [new branch] gh/guangyey/163/orig -> origin/gh/guangyey/163/orig 2025-12-04T11:11:09.6201354Z * [new branch] gh/guangyey/168/base -> origin/gh/guangyey/168/base 2025-12-04T11:11:09.6201453Z * [new branch] gh/guangyey/168/head -> origin/gh/guangyey/168/head 2025-12-04T11:11:09.6201524Z * [new branch] gh/guangyey/168/orig -> origin/gh/guangyey/168/orig 2025-12-04T11:11:09.6201598Z * [new branch] gh/guangyey/169/base -> origin/gh/guangyey/169/base 2025-12-04T11:11:09.6201668Z * [new branch] gh/guangyey/169/head -> origin/gh/guangyey/169/head 2025-12-04T11:11:09.6201739Z * [new branch] gh/guangyey/169/orig -> origin/gh/guangyey/169/orig 2025-12-04T11:11:09.6201813Z * [new branch] gh/guangyey/170/base -> origin/gh/guangyey/170/base 2025-12-04T11:11:09.6201885Z * [new branch] gh/guangyey/170/head -> origin/gh/guangyey/170/head 2025-12-04T11:11:09.6201955Z * [new branch] gh/guangyey/170/orig 
-> origin/gh/guangyey/170/orig 2025-12-04T11:11:09.6202031Z * [new branch] gh/guangyey/171/base -> origin/gh/guangyey/171/base 2025-12-04T11:11:09.6202103Z * [new branch] gh/guangyey/171/head -> origin/gh/guangyey/171/head 2025-12-04T11:11:09.6202175Z * [new branch] gh/guangyey/171/orig -> origin/gh/guangyey/171/orig 2025-12-04T11:11:09.6202250Z * [new branch] gh/guangyey/178/base -> origin/gh/guangyey/178/base 2025-12-04T11:11:09.6202322Z * [new branch] gh/guangyey/178/head -> origin/gh/guangyey/178/head 2025-12-04T11:11:09.6202397Z * [new branch] gh/guangyey/178/orig -> origin/gh/guangyey/178/orig 2025-12-04T11:11:09.6202468Z * [new branch] gh/guangyey/182/base -> origin/gh/guangyey/182/base 2025-12-04T11:11:09.6202539Z * [new branch] gh/guangyey/182/head -> origin/gh/guangyey/182/head 2025-12-04T11:11:09.6202611Z * [new branch] gh/guangyey/182/orig -> origin/gh/guangyey/182/orig 2025-12-04T11:11:09.6202683Z * [new branch] gh/guangyey/183/base -> origin/gh/guangyey/183/base 2025-12-04T11:11:09.6202753Z * [new branch] gh/guangyey/183/head -> origin/gh/guangyey/183/head 2025-12-04T11:11:09.6202826Z * [new branch] gh/guangyey/183/orig -> origin/gh/guangyey/183/orig 2025-12-04T11:11:09.6202897Z * [new branch] gh/guangyey/185/base -> origin/gh/guangyey/185/base 2025-12-04T11:11:09.6202967Z * [new branch] gh/guangyey/185/head -> origin/gh/guangyey/185/head 2025-12-04T11:11:09.6203039Z * [new branch] gh/guangyey/185/orig -> origin/gh/guangyey/185/orig 2025-12-04T11:11:09.6203110Z * [new branch] gh/guangyey/186/base -> origin/gh/guangyey/186/base 2025-12-04T11:11:09.6203184Z * [new branch] gh/guangyey/186/head -> origin/gh/guangyey/186/head 2025-12-04T11:11:09.6203257Z * [new branch] gh/guangyey/186/orig -> origin/gh/guangyey/186/orig 2025-12-04T11:11:09.6203329Z * [new branch] gh/guangyey/187/base -> origin/gh/guangyey/187/base 2025-12-04T11:11:09.6203399Z * [new branch] gh/guangyey/187/head -> origin/gh/guangyey/187/head 2025-12-04T11:11:09.6203470Z * [new branch] gh/guangyey/187/orig -> origin/gh/guangyey/187/orig 2025-12-04T11:11:09.6203539Z * [new branch] gh/guangyey/188/base -> origin/gh/guangyey/188/base 2025-12-04T11:11:09.6203609Z * [new branch] gh/guangyey/188/head -> origin/gh/guangyey/188/head 2025-12-04T11:11:09.6203680Z * [new branch] gh/guangyey/188/orig -> origin/gh/guangyey/188/orig 2025-12-04T11:11:09.6203750Z * [new branch] gh/guangyey/190/base -> origin/gh/guangyey/190/base 2025-12-04T11:11:09.6203842Z * [new branch] gh/guangyey/190/head -> origin/gh/guangyey/190/head 2025-12-04T11:11:09.6203912Z * [new branch] gh/guangyey/190/orig -> origin/gh/guangyey/190/orig 2025-12-04T11:11:09.6204036Z * [new branch] gh/guangyey/208/base -> origin/gh/guangyey/208/base 2025-12-04T11:11:09.6204110Z * [new branch] gh/guangyey/208/head -> origin/gh/guangyey/208/head 2025-12-04T11:11:09.6204181Z * [new branch] gh/guangyey/208/orig -> origin/gh/guangyey/208/orig 2025-12-04T11:11:09.6204251Z * [new branch] gh/guangyey/228/base -> origin/gh/guangyey/228/base 2025-12-04T11:11:09.6204324Z * [new branch] gh/guangyey/228/head -> origin/gh/guangyey/228/head 2025-12-04T11:11:09.6204395Z * [new branch] gh/guangyey/228/orig -> origin/gh/guangyey/228/orig 2025-12-04T11:11:09.6204466Z * [new branch] gh/guangyey/230/base -> origin/gh/guangyey/230/base 2025-12-04T11:11:09.6204541Z * [new branch] gh/guangyey/230/head -> origin/gh/guangyey/230/head 2025-12-04T11:11:09.6204613Z * [new branch] gh/guangyey/230/orig -> origin/gh/guangyey/230/orig 2025-12-04T11:11:09.6204688Z * [new branch] gh/guangyey/231/base -> 
origin/gh/guangyey/231/base 2025-12-04T11:11:09.6204763Z * [new branch] gh/guangyey/231/head -> origin/gh/guangyey/231/head 2025-12-04T11:11:09.6204836Z * [new branch] gh/guangyey/231/orig -> origin/gh/guangyey/231/orig 2025-12-04T11:11:09.6204908Z * [new branch] gh/guangyey/232/base -> origin/gh/guangyey/232/base 2025-12-04T11:11:09.6204985Z * [new branch] gh/guangyey/232/head -> origin/gh/guangyey/232/head 2025-12-04T11:11:09.6205057Z * [new branch] gh/guangyey/232/orig -> origin/gh/guangyey/232/orig 2025-12-04T11:11:09.6205128Z * [new branch] gh/guangyey/233/base -> origin/gh/guangyey/233/base 2025-12-04T11:11:09.6205206Z * [new branch] gh/guangyey/233/head -> origin/gh/guangyey/233/head 2025-12-04T11:11:09.6205279Z * [new branch] gh/guangyey/233/orig -> origin/gh/guangyey/233/orig 2025-12-04T11:11:09.6205357Z * [new branch] gh/guangyey/234/base -> origin/gh/guangyey/234/base 2025-12-04T11:11:09.6205429Z * [new branch] gh/guangyey/234/head -> origin/gh/guangyey/234/head 2025-12-04T11:11:09.6205503Z * [new branch] gh/guangyey/234/orig -> origin/gh/guangyey/234/orig 2025-12-04T11:11:09.6205579Z * [new branch] gh/guangyey/235/base -> origin/gh/guangyey/235/base 2025-12-04T11:11:09.6205652Z * [new branch] gh/guangyey/235/head -> origin/gh/guangyey/235/head 2025-12-04T11:11:09.6205723Z * [new branch] gh/guangyey/235/orig -> origin/gh/guangyey/235/orig 2025-12-04T11:11:09.6205798Z * [new branch] gh/guangyey/236/base -> origin/gh/guangyey/236/base 2025-12-04T11:11:09.6205868Z * [new branch] gh/guangyey/236/head -> origin/gh/guangyey/236/head 2025-12-04T11:11:09.6205939Z * [new branch] gh/guangyey/236/orig -> origin/gh/guangyey/236/orig 2025-12-04T11:11:09.6206015Z * [new branch] gh/guangyey/237/base -> origin/gh/guangyey/237/base 2025-12-04T11:11:09.6206086Z * [new branch] gh/guangyey/237/head -> origin/gh/guangyey/237/head 2025-12-04T11:11:09.6206159Z * [new branch] gh/guangyey/237/orig -> origin/gh/guangyey/237/orig 2025-12-04T11:11:09.6206232Z * [new branch] gh/guangyey/238/base -> origin/gh/guangyey/238/base 2025-12-04T11:11:09.6206304Z * [new branch] gh/guangyey/238/head -> origin/gh/guangyey/238/head 2025-12-04T11:11:09.6206375Z * [new branch] gh/guangyey/239/base -> origin/gh/guangyey/239/base 2025-12-04T11:11:09.6206485Z * [new branch] gh/guangyey/239/head -> origin/gh/guangyey/239/head 2025-12-04T11:11:09.6206557Z * [new branch] gh/guangyey/239/orig -> origin/gh/guangyey/239/orig 2025-12-04T11:11:09.6206629Z * [new branch] gh/guangyey/240/base -> origin/gh/guangyey/240/base 2025-12-04T11:11:09.6206728Z * [new branch] gh/guangyey/240/head -> origin/gh/guangyey/240/head 2025-12-04T11:11:09.6206800Z * [new branch] gh/guangyey/240/orig -> origin/gh/guangyey/240/orig 2025-12-04T11:11:09.6206872Z * [new branch] gh/guangyey/241/base -> origin/gh/guangyey/241/base 2025-12-04T11:11:09.6206946Z * [new branch] gh/guangyey/241/head -> origin/gh/guangyey/241/head 2025-12-04T11:11:09.6207017Z * [new branch] gh/guangyey/241/orig -> origin/gh/guangyey/241/orig 2025-12-04T11:11:09.6207091Z * [new branch] gh/guangyey/242/base -> origin/gh/guangyey/242/base 2025-12-04T11:11:09.6207163Z * [new branch] gh/guangyey/242/head -> origin/gh/guangyey/242/head 2025-12-04T11:11:09.6207235Z * [new branch] gh/guangyey/242/orig -> origin/gh/guangyey/242/orig 2025-12-04T11:11:09.6207307Z * [new branch] gh/guangyey/243/base -> origin/gh/guangyey/243/base 2025-12-04T11:11:09.6207379Z * [new branch] gh/guangyey/243/head -> origin/gh/guangyey/243/head 2025-12-04T11:11:09.6207451Z * [new branch] gh/guangyey/243/orig -> 
origin/gh/guangyey/243/orig 2025-12-04T11:11:09.6207526Z * [new branch] gh/guangyey/244/base -> origin/gh/guangyey/244/base 2025-12-04T11:11:09.6207598Z * [new branch] gh/guangyey/244/head -> origin/gh/guangyey/244/head 2025-12-04T11:11:09.6207668Z * [new branch] gh/guangyey/244/orig -> origin/gh/guangyey/244/orig 2025-12-04T11:11:09.6207741Z * [new branch] gh/guangyey/245/base -> origin/gh/guangyey/245/base 2025-12-04T11:11:09.6207813Z * [new branch] gh/guangyey/245/head -> origin/gh/guangyey/245/head 2025-12-04T11:11:09.6207884Z * [new branch] gh/guangyey/245/orig -> origin/gh/guangyey/245/orig 2025-12-04T11:11:09.6207959Z * [new branch] gh/guangyey/246/base -> origin/gh/guangyey/246/base 2025-12-04T11:11:09.6208031Z * [new branch] gh/guangyey/246/head -> origin/gh/guangyey/246/head 2025-12-04T11:11:09.6208103Z * [new branch] gh/guangyey/246/orig -> origin/gh/guangyey/246/orig 2025-12-04T11:11:09.6208227Z * [new branch] gh/guangyey/247/base -> origin/gh/guangyey/247/base 2025-12-04T11:11:09.6208301Z * [new branch] gh/guangyey/247/head -> origin/gh/guangyey/247/head 2025-12-04T11:11:09.6208375Z * [new branch] gh/guangyey/247/orig -> origin/gh/guangyey/247/orig 2025-12-04T11:11:09.6208452Z * [new branch] gh/guangyey/248/base -> origin/gh/guangyey/248/base 2025-12-04T11:11:09.6208526Z * [new branch] gh/guangyey/248/head -> origin/gh/guangyey/248/head 2025-12-04T11:11:09.6208603Z * [new branch] gh/guangyey/248/orig -> origin/gh/guangyey/248/orig 2025-12-04T11:11:09.6208676Z * [new branch] gh/guangyey/249/base -> origin/gh/guangyey/249/base 2025-12-04T11:11:09.6208750Z * [new branch] gh/guangyey/249/head -> origin/gh/guangyey/249/head 2025-12-04T11:11:09.6208827Z * [new branch] gh/guangyey/249/orig -> origin/gh/guangyey/249/orig 2025-12-04T11:11:09.6208900Z * [new branch] gh/guangyey/250/base -> origin/gh/guangyey/250/base 2025-12-04T11:11:09.6208975Z * [new branch] gh/guangyey/250/head -> origin/gh/guangyey/250/head 2025-12-04T11:11:09.6209051Z * [new branch] gh/guangyey/250/orig -> origin/gh/guangyey/250/orig 2025-12-04T11:11:09.6209124Z * [new branch] gh/guangyey/251/base -> origin/gh/guangyey/251/base 2025-12-04T11:11:09.6209221Z * [new branch] gh/guangyey/251/head -> origin/gh/guangyey/251/head 2025-12-04T11:11:09.6209299Z * [new branch] gh/guangyey/251/orig -> origin/gh/guangyey/251/orig 2025-12-04T11:11:09.6209399Z * [new branch] gh/guangyey/252/base -> origin/gh/guangyey/252/base 2025-12-04T11:11:09.6209472Z * [new branch] gh/guangyey/252/head -> origin/gh/guangyey/252/head 2025-12-04T11:11:09.6209551Z * [new branch] gh/guangyey/252/orig -> origin/gh/guangyey/252/orig 2025-12-04T11:11:09.6209625Z * [new branch] gh/guangyey/253/base -> origin/gh/guangyey/253/base 2025-12-04T11:11:09.6209698Z * [new branch] gh/guangyey/253/head -> origin/gh/guangyey/253/head 2025-12-04T11:11:09.6209775Z * [new branch] gh/guangyey/253/orig -> origin/gh/guangyey/253/orig 2025-12-04T11:11:09.6209848Z * [new branch] gh/guangyey/254/base -> origin/gh/guangyey/254/base 2025-12-04T11:11:09.6209921Z * [new branch] gh/guangyey/254/head -> origin/gh/guangyey/254/head 2025-12-04T11:11:09.6210001Z * [new branch] gh/guangyey/254/orig -> origin/gh/guangyey/254/orig 2025-12-04T11:11:09.6210076Z * [new branch] gh/guangyey/255/base -> origin/gh/guangyey/255/base 2025-12-04T11:11:09.6210153Z * [new branch] gh/guangyey/255/head -> origin/gh/guangyey/255/head 2025-12-04T11:11:09.6210226Z * [new branch] gh/guangyey/255/orig -> origin/gh/guangyey/255/orig 2025-12-04T11:11:09.6210299Z * [new branch] gh/guangyey/256/base -> 
origin/gh/guangyey/256/base 2025-12-04T11:11:09.6210378Z * [new branch] gh/guangyey/256/head -> origin/gh/guangyey/256/head 2025-12-04T11:11:09.6210448Z * [new branch] gh/guangyey/256/orig -> origin/gh/guangyey/256/orig 2025-12-04T11:11:09.6210550Z * [new branch] gh/guilhermeleobas/107/base -> origin/gh/guilhermeleobas/107/base 2025-12-04T11:11:09.6210653Z * [new branch] gh/guilhermeleobas/107/head -> origin/gh/guilhermeleobas/107/head 2025-12-04T11:11:09.6210747Z * [new branch] gh/guilhermeleobas/107/orig -> origin/gh/guilhermeleobas/107/orig 2025-12-04T11:11:09.6210840Z * [new branch] gh/guilhermeleobas/108/base -> origin/gh/guilhermeleobas/108/base 2025-12-04T11:11:09.6210935Z * [new branch] gh/guilhermeleobas/108/head -> origin/gh/guilhermeleobas/108/head 2025-12-04T11:11:09.6211025Z * [new branch] gh/guilhermeleobas/108/orig -> origin/gh/guilhermeleobas/108/orig 2025-12-04T11:11:09.6211116Z * [new branch] gh/guilhermeleobas/150/base -> origin/gh/guilhermeleobas/150/base 2025-12-04T11:11:09.6211210Z * [new branch] gh/guilhermeleobas/150/head -> origin/gh/guilhermeleobas/150/head 2025-12-04T11:11:09.6211301Z * [new branch] gh/guilhermeleobas/150/orig -> origin/gh/guilhermeleobas/150/orig 2025-12-04T11:11:09.6211392Z * [new branch] gh/guilhermeleobas/168/base -> origin/gh/guilhermeleobas/168/base 2025-12-04T11:11:09.6211487Z * [new branch] gh/guilhermeleobas/168/head -> origin/gh/guilhermeleobas/168/head 2025-12-04T11:11:09.6211578Z * [new branch] gh/guilhermeleobas/168/orig -> origin/gh/guilhermeleobas/168/orig 2025-12-04T11:11:09.6211673Z * [new branch] gh/guilhermeleobas/169/base -> origin/gh/guilhermeleobas/169/base 2025-12-04T11:11:09.6211763Z * [new branch] gh/guilhermeleobas/169/head -> origin/gh/guilhermeleobas/169/head 2025-12-04T11:11:09.6211853Z * [new branch] gh/guilhermeleobas/169/orig -> origin/gh/guilhermeleobas/169/orig 2025-12-04T11:11:09.6211945Z * [new branch] gh/guilhermeleobas/170/base -> origin/gh/guilhermeleobas/170/base 2025-12-04T11:11:09.6212035Z * [new branch] gh/guilhermeleobas/170/head -> origin/gh/guilhermeleobas/170/head 2025-12-04T11:11:09.6212151Z * [new branch] gh/guilhermeleobas/170/orig -> origin/gh/guilhermeleobas/170/orig 2025-12-04T11:11:09.6212246Z * [new branch] gh/guilhermeleobas/171/base -> origin/gh/guilhermeleobas/171/base 2025-12-04T11:11:09.6212359Z * [new branch] gh/guilhermeleobas/171/head -> origin/gh/guilhermeleobas/171/head 2025-12-04T11:11:09.6212449Z * [new branch] gh/guilhermeleobas/171/orig -> origin/gh/guilhermeleobas/171/orig 2025-12-04T11:11:09.6212544Z * [new branch] gh/guilhermeleobas/173/base -> origin/gh/guilhermeleobas/173/base 2025-12-04T11:11:09.6212634Z * [new branch] gh/guilhermeleobas/173/head -> origin/gh/guilhermeleobas/173/head 2025-12-04T11:11:09.6212723Z * [new branch] gh/guilhermeleobas/173/orig -> origin/gh/guilhermeleobas/173/orig 2025-12-04T11:11:09.6212817Z * [new branch] gh/guilhermeleobas/193/base -> origin/gh/guilhermeleobas/193/base 2025-12-04T11:11:09.6212909Z * [new branch] gh/guilhermeleobas/193/head -> origin/gh/guilhermeleobas/193/head 2025-12-04T11:11:09.6212998Z * [new branch] gh/guilhermeleobas/193/orig -> origin/gh/guilhermeleobas/193/orig 2025-12-04T11:11:09.6213094Z * [new branch] gh/guilhermeleobas/204/base -> origin/gh/guilhermeleobas/204/base 2025-12-04T11:11:09.6213184Z * [new branch] gh/guilhermeleobas/204/head -> origin/gh/guilhermeleobas/204/head 2025-12-04T11:11:09.6213279Z * [new branch] gh/guilhermeleobas/204/orig -> origin/gh/guilhermeleobas/204/orig 2025-12-04T11:11:09.6213371Z * 
[new branch] gh/guilhermeleobas/211/base -> origin/gh/guilhermeleobas/211/base 2025-12-04T11:11:09.6213458Z * [new branch] gh/guilhermeleobas/211/head -> origin/gh/guilhermeleobas/211/head 2025-12-04T11:11:09.6213552Z * [new branch] gh/guilhermeleobas/211/orig -> origin/gh/guilhermeleobas/211/orig 2025-12-04T11:11:09.6213644Z * [new branch] gh/guilhermeleobas/226/base -> origin/gh/guilhermeleobas/226/base 2025-12-04T11:11:09.6213735Z * [new branch] gh/guilhermeleobas/226/head -> origin/gh/guilhermeleobas/226/head 2025-12-04T11:11:09.6213830Z * [new branch] gh/guilhermeleobas/226/orig -> origin/gh/guilhermeleobas/226/orig 2025-12-04T11:11:09.6213921Z * [new branch] gh/guilhermeleobas/236/base -> origin/gh/guilhermeleobas/236/base 2025-12-04T11:11:09.6214011Z * [new branch] gh/guilhermeleobas/236/head -> origin/gh/guilhermeleobas/236/head 2025-12-04T11:11:09.6214103Z * [new branch] gh/guilhermeleobas/236/orig -> origin/gh/guilhermeleobas/236/orig 2025-12-04T11:11:09.6214193Z * [new branch] gh/guilhermeleobas/247/base -> origin/gh/guilhermeleobas/247/base 2025-12-04T11:11:09.6214282Z * [new branch] gh/guilhermeleobas/247/head -> origin/gh/guilhermeleobas/247/head 2025-12-04T11:11:09.6214378Z * [new branch] gh/guilhermeleobas/247/orig -> origin/gh/guilhermeleobas/247/orig 2025-12-04T11:11:09.6214468Z * [new branch] gh/guilhermeleobas/248/base -> origin/gh/guilhermeleobas/248/base 2025-12-04T11:11:09.6214558Z * [new branch] gh/guilhermeleobas/248/head -> origin/gh/guilhermeleobas/248/head 2025-12-04T11:11:09.6214652Z * [new branch] gh/guilhermeleobas/248/orig -> origin/gh/guilhermeleobas/248/orig 2025-12-04T11:11:09.6214740Z * [new branch] gh/guilhermeleobas/250/base -> origin/gh/guilhermeleobas/250/base 2025-12-04T11:11:09.6214833Z * [new branch] gh/guilhermeleobas/250/head -> origin/gh/guilhermeleobas/250/head 2025-12-04T11:11:09.6214922Z * [new branch] gh/guilhermeleobas/250/orig -> origin/gh/guilhermeleobas/250/orig 2025-12-04T11:11:09.6215013Z * [new branch] gh/guilhermeleobas/253/base -> origin/gh/guilhermeleobas/253/base 2025-12-04T11:11:09.6215126Z * [new branch] gh/guilhermeleobas/253/head -> origin/gh/guilhermeleobas/253/head 2025-12-04T11:11:09.6215217Z * [new branch] gh/guilhermeleobas/253/orig -> origin/gh/guilhermeleobas/253/orig 2025-12-04T11:11:09.6215307Z * [new branch] gh/guilhermeleobas/254/base -> origin/gh/guilhermeleobas/254/base 2025-12-04T11:11:09.6215420Z * [new branch] gh/guilhermeleobas/254/head -> origin/gh/guilhermeleobas/254/head 2025-12-04T11:11:09.6215510Z * [new branch] gh/guilhermeleobas/254/orig -> origin/gh/guilhermeleobas/254/orig 2025-12-04T11:11:09.6215599Z * [new branch] gh/guilhermeleobas/255/base -> origin/gh/guilhermeleobas/255/base 2025-12-04T11:11:09.6215693Z * [new branch] gh/guilhermeleobas/255/head -> origin/gh/guilhermeleobas/255/head 2025-12-04T11:11:09.6215783Z * [new branch] gh/guilhermeleobas/255/orig -> origin/gh/guilhermeleobas/255/orig 2025-12-04T11:11:09.6215874Z * [new branch] gh/guilhermeleobas/256/base -> origin/gh/guilhermeleobas/256/base 2025-12-04T11:11:09.6215968Z * [new branch] gh/guilhermeleobas/256/head -> origin/gh/guilhermeleobas/256/head 2025-12-04T11:11:09.6216058Z * [new branch] gh/guilhermeleobas/256/orig -> origin/gh/guilhermeleobas/256/orig 2025-12-04T11:11:09.6216150Z * [new branch] gh/guilhermeleobas/257/base -> origin/gh/guilhermeleobas/257/base 2025-12-04T11:11:09.6216241Z * [new branch] gh/guilhermeleobas/257/head -> origin/gh/guilhermeleobas/257/head 2025-12-04T11:11:09.6216330Z * [new branch] 
gh/guilhermeleobas/257/orig -> origin/gh/guilhermeleobas/257/orig 2025-12-04T11:11:09.6216425Z * [new branch] gh/guilhermeleobas/258/base -> origin/gh/guilhermeleobas/258/base 2025-12-04T11:11:09.6216516Z * [new branch] gh/guilhermeleobas/258/head -> origin/gh/guilhermeleobas/258/head 2025-12-04T11:11:09.6216606Z * [new branch] gh/guilhermeleobas/258/orig -> origin/gh/guilhermeleobas/258/orig 2025-12-04T11:11:09.6216701Z * [new branch] gh/guilhermeleobas/259/base -> origin/gh/guilhermeleobas/259/base 2025-12-04T11:11:09.6216792Z * [new branch] gh/guilhermeleobas/259/head -> origin/gh/guilhermeleobas/259/head 2025-12-04T11:11:09.6216884Z * [new branch] gh/guilhermeleobas/259/orig -> origin/gh/guilhermeleobas/259/orig 2025-12-04T11:11:09.6216981Z * [new branch] gh/guilhermeleobas/260/base -> origin/gh/guilhermeleobas/260/base 2025-12-04T11:11:09.6217071Z * [new branch] gh/guilhermeleobas/260/head -> origin/gh/guilhermeleobas/260/head 2025-12-04T11:11:09.6217162Z * [new branch] gh/guilhermeleobas/260/orig -> origin/gh/guilhermeleobas/260/orig 2025-12-04T11:11:09.6217255Z * [new branch] gh/guilhermeleobas/261/base -> origin/gh/guilhermeleobas/261/base 2025-12-04T11:11:09.6217345Z * [new branch] gh/guilhermeleobas/261/head -> origin/gh/guilhermeleobas/261/head 2025-12-04T11:11:09.6217437Z * [new branch] gh/guilhermeleobas/261/orig -> origin/gh/guilhermeleobas/261/orig 2025-12-04T11:11:09.6217532Z * [new branch] gh/guilhermeleobas/262/base -> origin/gh/guilhermeleobas/262/base 2025-12-04T11:11:09.6217624Z * [new branch] gh/guilhermeleobas/262/head -> origin/gh/guilhermeleobas/262/head 2025-12-04T11:11:09.6217711Z * [new branch] gh/guilhermeleobas/262/orig -> origin/gh/guilhermeleobas/262/orig 2025-12-04T11:11:09.6217805Z * [new branch] gh/guilhermeleobas/263/base -> origin/gh/guilhermeleobas/263/base 2025-12-04T11:11:09.6217896Z * [new branch] gh/guilhermeleobas/263/head -> origin/gh/guilhermeleobas/263/head 2025-12-04T11:11:09.6217990Z * [new branch] gh/guilhermeleobas/263/orig -> origin/gh/guilhermeleobas/263/orig 2025-12-04T11:11:09.6218081Z * [new branch] gh/guilhermeleobas/264/base -> origin/gh/guilhermeleobas/264/base 2025-12-04T11:11:09.6218253Z * [new branch] gh/guilhermeleobas/264/head -> origin/gh/guilhermeleobas/264/head 2025-12-04T11:11:09.6218349Z * [new branch] gh/guilhermeleobas/264/orig -> origin/gh/guilhermeleobas/264/orig 2025-12-04T11:11:09.6218464Z * [new branch] gh/guilhermeleobas/265/base -> origin/gh/guilhermeleobas/265/base 2025-12-04T11:11:09.6218553Z * [new branch] gh/guilhermeleobas/265/head -> origin/gh/guilhermeleobas/265/head 2025-12-04T11:11:09.6218647Z * [new branch] gh/guilhermeleobas/265/orig -> origin/gh/guilhermeleobas/265/orig 2025-12-04T11:11:09.6218738Z * [new branch] gh/guilhermeleobas/266/base -> origin/gh/guilhermeleobas/266/base 2025-12-04T11:11:09.6218828Z * [new branch] gh/guilhermeleobas/266/head -> origin/gh/guilhermeleobas/266/head 2025-12-04T11:11:09.6218922Z * [new branch] gh/guilhermeleobas/266/orig -> origin/gh/guilhermeleobas/266/orig 2025-12-04T11:11:09.6219013Z * [new branch] gh/guilhermeleobas/267/base -> origin/gh/guilhermeleobas/267/base 2025-12-04T11:11:09.6219147Z * [new branch] gh/guilhermeleobas/267/head -> origin/gh/guilhermeleobas/267/head 2025-12-04T11:11:09.6219294Z * [new branch] gh/guilhermeleobas/267/orig -> origin/gh/guilhermeleobas/267/orig 2025-12-04T11:11:09.6219382Z * [new branch] gh/hameerabbasi/1/base -> origin/gh/hameerabbasi/1/base 2025-12-04T11:11:09.6219465Z * [new branch] gh/hameerabbasi/1/head -> 
origin/gh/hameerabbasi/1/head 2025-12-04T11:11:09.6219548Z * [new branch] gh/hameerabbasi/2/base -> origin/gh/hameerabbasi/2/base 2025-12-04T11:11:09.6219626Z * [new branch] gh/hameerabbasi/2/head -> origin/gh/hameerabbasi/2/head 2025-12-04T11:11:09.6219707Z * [new branch] gh/hameerabbasi/2/orig -> origin/gh/hameerabbasi/2/orig 2025-12-04T11:11:09.6219785Z * [new branch] gh/hameerabbasi/3/base -> origin/gh/hameerabbasi/3/base 2025-12-04T11:11:09.6219860Z * [new branch] gh/hameerabbasi/3/head -> origin/gh/hameerabbasi/3/head 2025-12-04T11:11:09.6219941Z * [new branch] gh/hameerabbasi/3/orig -> origin/gh/hameerabbasi/3/orig 2025-12-04T11:11:09.6220019Z * [new branch] gh/hameerabbasi/4/base -> origin/gh/hameerabbasi/4/base 2025-12-04T11:11:09.6220097Z * [new branch] gh/hameerabbasi/4/head -> origin/gh/hameerabbasi/4/head 2025-12-04T11:11:09.6220177Z * [new branch] gh/hameerabbasi/4/orig -> origin/gh/hameerabbasi/4/orig 2025-12-04T11:11:09.6220248Z * [new branch] gh/huydhn/1/next -> origin/gh/huydhn/1/next 2025-12-04T11:11:09.6220318Z * [new branch] gh/huydhn/2/next -> origin/gh/huydhn/2/next 2025-12-04T11:11:09.6220391Z * [new branch] gh/huydhn/3/next -> origin/gh/huydhn/3/next 2025-12-04T11:11:09.6220460Z * [new branch] gh/huydhn/4/next -> origin/gh/huydhn/4/next 2025-12-04T11:11:09.6220530Z * [new branch] gh/huydhn/5/next -> origin/gh/huydhn/5/next 2025-12-04T11:11:09.6220602Z * [new branch] gh/huydhn/6/next -> origin/gh/huydhn/6/next 2025-12-04T11:11:09.6220672Z * [new branch] gh/int3/97/base -> origin/gh/int3/97/base 2025-12-04T11:11:09.6220741Z * [new branch] gh/int3/97/head -> origin/gh/int3/97/head 2025-12-04T11:11:09.6220817Z * [new branch] gh/isuruf/101/base -> origin/gh/isuruf/101/base 2025-12-04T11:11:09.6220889Z * [new branch] gh/isuruf/101/head -> origin/gh/isuruf/101/head 2025-12-04T11:11:09.6220960Z * [new branch] gh/isuruf/146/base -> origin/gh/isuruf/146/base 2025-12-04T11:11:09.6221031Z * [new branch] gh/isuruf/146/head -> origin/gh/isuruf/146/head 2025-12-04T11:11:09.6221101Z * [new branch] gh/isuruf/146/orig -> origin/gh/isuruf/146/orig 2025-12-04T11:11:09.6221196Z * [new branch] gh/isuruf/158/base -> origin/gh/isuruf/158/base 2025-12-04T11:11:09.6221271Z * [new branch] gh/isuruf/158/head -> origin/gh/isuruf/158/head 2025-12-04T11:11:09.6221365Z * [new branch] gh/isuruf/159/base -> origin/gh/isuruf/159/base 2025-12-04T11:11:09.6221438Z * [new branch] gh/isuruf/159/head -> origin/gh/isuruf/159/head 2025-12-04T11:11:09.6221508Z * [new branch] gh/isuruf/160/base -> origin/gh/isuruf/160/base 2025-12-04T11:11:09.6221577Z * [new branch] gh/isuruf/160/head -> origin/gh/isuruf/160/head 2025-12-04T11:11:09.6221651Z * [new branch] gh/isuruf/160/orig -> origin/gh/isuruf/160/orig 2025-12-04T11:11:09.6221723Z * [new branch] gh/isuruf/81/base -> origin/gh/isuruf/81/base 2025-12-04T11:11:09.6221794Z * [new branch] gh/isuruf/81/head -> origin/gh/isuruf/81/head 2025-12-04T11:11:09.6221869Z * [new branch] gh/isuruf/81/orig -> origin/gh/isuruf/81/orig 2025-12-04T11:11:09.6221946Z * [new branch] gh/jamesjwu/176/base -> origin/gh/jamesjwu/176/base 2025-12-04T11:11:09.6222025Z * [new branch] gh/jamesjwu/176/head -> origin/gh/jamesjwu/176/head 2025-12-04T11:11:09.6222105Z * [new branch] gh/jamesjwu/176/orig -> origin/gh/jamesjwu/176/orig 2025-12-04T11:11:09.6222176Z * [new branch] gh/jamesjwu/187/base -> origin/gh/jamesjwu/187/base 2025-12-04T11:11:09.6222250Z * [new branch] gh/jamesjwu/187/head -> origin/gh/jamesjwu/187/head 2025-12-04T11:11:09.6222326Z * [new branch] gh/jamesjwu/187/orig -> 
origin/gh/jamesjwu/187/orig 2025-12-04T11:11:09.6222399Z * [new branch] gh/jamesjwu/196/base -> origin/gh/jamesjwu/196/base 2025-12-04T11:11:09.6222470Z * [new branch] gh/jamesjwu/196/head -> origin/gh/jamesjwu/196/head 2025-12-04T11:11:09.6222548Z * [new branch] gh/jamesjwu/196/orig -> origin/gh/jamesjwu/196/orig 2025-12-04T11:11:09.6222621Z * [new branch] gh/jamesjwu/198/base -> origin/gh/jamesjwu/198/base 2025-12-04T11:11:09.6222695Z * [new branch] gh/jamesjwu/198/head -> origin/gh/jamesjwu/198/head 2025-12-04T11:11:09.6222775Z * [new branch] gh/jamesjwu/198/orig -> origin/gh/jamesjwu/198/orig 2025-12-04T11:11:09.6222847Z * [new branch] gh/jamesjwu/207/base -> origin/gh/jamesjwu/207/base 2025-12-04T11:11:09.6222919Z * [new branch] gh/jamesjwu/207/head -> origin/gh/jamesjwu/207/head 2025-12-04T11:11:09.6222996Z * [new branch] gh/jamesjwu/207/orig -> origin/gh/jamesjwu/207/orig 2025-12-04T11:11:09.6223069Z * [new branch] gh/jamesjwu/208/base -> origin/gh/jamesjwu/208/base 2025-12-04T11:11:09.6223147Z * [new branch] gh/jamesjwu/208/head -> origin/gh/jamesjwu/208/head 2025-12-04T11:11:09.6223218Z * [new branch] gh/jamesjwu/208/orig -> origin/gh/jamesjwu/208/orig 2025-12-04T11:11:09.6223291Z * [new branch] gh/jamesjwu/52/base -> origin/gh/jamesjwu/52/base 2025-12-04T11:11:09.6223371Z * [new branch] gh/jamesjwu/52/head -> origin/gh/jamesjwu/52/head 2025-12-04T11:11:09.6223444Z * [new branch] gh/jamesjwu/53/base -> origin/gh/jamesjwu/53/base 2025-12-04T11:11:09.6223515Z * [new branch] gh/jamesjwu/53/head -> origin/gh/jamesjwu/53/head 2025-12-04T11:11:09.6223590Z * [new branch] gh/jamesjwu/54/base -> origin/gh/jamesjwu/54/base 2025-12-04T11:11:09.6223660Z * [new branch] gh/jamesjwu/54/head -> origin/gh/jamesjwu/54/head 2025-12-04T11:11:09.6223729Z * [new branch] gh/jamesjwu/55/base -> origin/gh/jamesjwu/55/base 2025-12-04T11:11:09.6223800Z * [new branch] gh/jamesjwu/55/head -> origin/gh/jamesjwu/55/head 2025-12-04T11:11:09.6223901Z * [new branch] gh/jamesjwu/56/base -> origin/gh/jamesjwu/56/base 2025-12-04T11:11:09.6223973Z * [new branch] gh/jamesjwu/56/head -> origin/gh/jamesjwu/56/head 2025-12-04T11:11:09.6224068Z * [new branch] gh/jamesjwu/57/base -> origin/gh/jamesjwu/57/base 2025-12-04T11:11:09.6224139Z * [new branch] gh/jamesjwu/57/head -> origin/gh/jamesjwu/57/head 2025-12-04T11:11:09.6224210Z * [new branch] gh/jamesjwu/58/base -> origin/gh/jamesjwu/58/base 2025-12-04T11:11:09.6224284Z * [new branch] gh/jamesjwu/58/head -> origin/gh/jamesjwu/58/head 2025-12-04T11:11:09.6224353Z * [new branch] gh/jamesjwu/59/base -> origin/gh/jamesjwu/59/base 2025-12-04T11:11:09.6224423Z * [new branch] gh/jamesjwu/59/head -> origin/gh/jamesjwu/59/head 2025-12-04T11:11:09.6224495Z * [new branch] gh/jamesjwu/60/base -> origin/gh/jamesjwu/60/base 2025-12-04T11:11:09.6224566Z * [new branch] gh/jamesjwu/60/head -> origin/gh/jamesjwu/60/head 2025-12-04T11:11:09.6224636Z * [new branch] gh/jamesjwu/61/base -> origin/gh/jamesjwu/61/base 2025-12-04T11:11:09.6224711Z * [new branch] gh/jamesjwu/61/head -> origin/gh/jamesjwu/61/head 2025-12-04T11:11:09.6224781Z * [new branch] gh/jamesjwu/62/base -> origin/gh/jamesjwu/62/base 2025-12-04T11:11:09.6224853Z * [new branch] gh/jamesjwu/62/head -> origin/gh/jamesjwu/62/head 2025-12-04T11:11:09.6224923Z * [new branch] gh/jamesjwu/63/base -> origin/gh/jamesjwu/63/base 2025-12-04T11:11:09.6224993Z * [new branch] gh/jamesjwu/63/head -> origin/gh/jamesjwu/63/head 2025-12-04T11:11:09.6225065Z * [new branch] gh/jamesjwu/64/base -> origin/gh/jamesjwu/64/base 
2025-12-04T11:11:09.6225135Z * [new branch] gh/jamesjwu/64/head -> origin/gh/jamesjwu/64/head 2025-12-04T11:11:09.6225207Z * [new branch] gh/jamesjwu/65/base -> origin/gh/jamesjwu/65/base 2025-12-04T11:11:09.6225279Z * [new branch] gh/jamesjwu/65/head -> origin/gh/jamesjwu/65/head 2025-12-04T11:11:09.6225353Z * [new branch] gh/janeyx99/165/base -> origin/gh/janeyx99/165/base 2025-12-04T11:11:09.6225424Z * [new branch] gh/janeyx99/165/head -> origin/gh/janeyx99/165/head 2025-12-04T11:11:09.6225496Z * [new branch] gh/janeyx99/165/orig -> origin/gh/janeyx99/165/orig 2025-12-04T11:11:09.6225567Z * [new branch] gh/janeyx99/201/base -> origin/gh/janeyx99/201/base 2025-12-04T11:11:09.6225638Z * [new branch] gh/janeyx99/201/head -> origin/gh/janeyx99/201/head 2025-12-04T11:11:09.6225712Z * [new branch] gh/janeyx99/201/orig -> origin/gh/janeyx99/201/orig 2025-12-04T11:11:09.6225783Z * [new branch] gh/janeyx99/225/base -> origin/gh/janeyx99/225/base 2025-12-04T11:11:09.6225855Z * [new branch] gh/janeyx99/225/head -> origin/gh/janeyx99/225/head 2025-12-04T11:11:09.6225931Z * [new branch] gh/janeyx99/225/orig -> origin/gh/janeyx99/225/orig 2025-12-04T11:11:09.6226003Z * [new branch] gh/janeyx99/299/base -> origin/gh/janeyx99/299/base 2025-12-04T11:11:09.6226076Z * [new branch] gh/janeyx99/299/head -> origin/gh/janeyx99/299/head 2025-12-04T11:11:09.6226151Z * [new branch] gh/janeyx99/299/orig -> origin/gh/janeyx99/299/orig 2025-12-04T11:11:09.6226220Z * [new branch] gh/janeyx99/302/base -> origin/gh/janeyx99/302/base 2025-12-04T11:11:09.6226292Z * [new branch] gh/janeyx99/302/head -> origin/gh/janeyx99/302/head 2025-12-04T11:11:09.6226364Z * [new branch] gh/janeyx99/303/base -> origin/gh/janeyx99/303/base 2025-12-04T11:11:09.6226454Z * [new branch] gh/janeyx99/303/head -> origin/gh/janeyx99/303/head 2025-12-04T11:11:09.6226524Z * [new branch] gh/janeyx99/305/base -> origin/gh/janeyx99/305/base 2025-12-04T11:11:09.6226594Z * [new branch] gh/janeyx99/305/head -> origin/gh/janeyx99/305/head 2025-12-04T11:11:09.6226687Z * [new branch] gh/janeyx99/306/base -> origin/gh/janeyx99/306/base 2025-12-04T11:11:09.6226760Z * [new branch] gh/janeyx99/306/head -> origin/gh/janeyx99/306/head 2025-12-04T11:11:09.6226832Z * [new branch] gh/janeyx99/314/base -> origin/gh/janeyx99/314/base 2025-12-04T11:11:09.6226903Z * [new branch] gh/janeyx99/314/head -> origin/gh/janeyx99/314/head 2025-12-04T11:11:09.6226978Z * [new branch] gh/janeyx99/314/orig -> origin/gh/janeyx99/314/orig 2025-12-04T11:11:09.6227049Z * [new branch] gh/janeyx99/315/base -> origin/gh/janeyx99/315/base 2025-12-04T11:11:09.6227123Z * [new branch] gh/janeyx99/315/head -> origin/gh/janeyx99/315/head 2025-12-04T11:11:09.6227198Z * [new branch] gh/janeyx99/315/orig -> origin/gh/janeyx99/315/orig 2025-12-04T11:11:09.6227269Z * [new branch] gh/janeyx99/316/base -> origin/gh/janeyx99/316/base 2025-12-04T11:11:09.6227342Z * [new branch] gh/janeyx99/316/head -> origin/gh/janeyx99/316/head 2025-12-04T11:11:09.6227419Z * [new branch] gh/janeyx99/316/orig -> origin/gh/janeyx99/316/orig 2025-12-04T11:11:09.6227491Z * [new branch] gh/janeyx99/317/base -> origin/gh/janeyx99/317/base 2025-12-04T11:11:09.6227564Z * [new branch] gh/janeyx99/317/head -> origin/gh/janeyx99/317/head 2025-12-04T11:11:09.6227639Z * [new branch] gh/janeyx99/317/orig -> origin/gh/janeyx99/317/orig 2025-12-04T11:11:09.6227710Z * [new branch] gh/janeyx99/325/base -> origin/gh/janeyx99/325/base 2025-12-04T11:11:09.6227782Z * [new branch] gh/janeyx99/325/head -> origin/gh/janeyx99/325/head 
2025-12-04T11:11:09.6227859Z * [new branch] gh/janeyx99/325/orig -> origin/gh/janeyx99/325/orig 2025-12-04T11:11:09.6227930Z * [new branch] gh/janeyx99/327/base -> origin/gh/janeyx99/327/base 2025-12-04T11:11:09.6228008Z * [new branch] gh/janeyx99/327/head -> origin/gh/janeyx99/327/head 2025-12-04T11:11:09.6228080Z * [new branch] gh/janeyx99/327/orig -> origin/gh/janeyx99/327/orig 2025-12-04T11:11:09.6228201Z * [new branch] gh/janeyx99/328/base -> origin/gh/janeyx99/328/base 2025-12-04T11:11:09.6228277Z * [new branch] gh/janeyx99/328/head -> origin/gh/janeyx99/328/head 2025-12-04T11:11:09.6228348Z * [new branch] gh/janeyx99/328/orig -> origin/gh/janeyx99/328/orig 2025-12-04T11:11:09.6228420Z * [new branch] gh/janeyx99/329/base -> origin/gh/janeyx99/329/base 2025-12-04T11:11:09.6228497Z * [new branch] gh/janeyx99/329/head -> origin/gh/janeyx99/329/head 2025-12-04T11:11:09.6228569Z * [new branch] gh/janeyx99/329/orig -> origin/gh/janeyx99/329/orig 2025-12-04T11:11:09.6228642Z * [new branch] gh/janeyx99/330/base -> origin/gh/janeyx99/330/base 2025-12-04T11:11:09.6228720Z * [new branch] gh/janeyx99/330/head -> origin/gh/janeyx99/330/head 2025-12-04T11:11:09.6228789Z * [new branch] gh/janeyx99/330/orig -> origin/gh/janeyx99/330/orig 2025-12-04T11:11:09.6228859Z * [new branch] gh/janeyx99/331/base -> origin/gh/janeyx99/331/base 2025-12-04T11:11:09.6228938Z * [new branch] gh/janeyx99/331/head -> origin/gh/janeyx99/331/head 2025-12-04T11:11:09.6229011Z * [new branch] gh/janeyx99/331/orig -> origin/gh/janeyx99/331/orig 2025-12-04T11:11:09.6229082Z * [new branch] gh/janeyx99/332/base -> origin/gh/janeyx99/332/base 2025-12-04T11:11:09.6229189Z * [new branch] gh/janeyx99/332/head -> origin/gh/janeyx99/332/head 2025-12-04T11:11:09.6229262Z * [new branch] gh/janeyx99/332/orig -> origin/gh/janeyx99/332/orig 2025-12-04T11:11:09.6229332Z * [new branch] gh/janeyx99/333/base -> origin/gh/janeyx99/333/base 2025-12-04T11:11:09.6229436Z * [new branch] gh/janeyx99/333/head -> origin/gh/janeyx99/333/head 2025-12-04T11:11:09.6229508Z * [new branch] gh/janeyx99/333/orig -> origin/gh/janeyx99/333/orig 2025-12-04T11:11:09.6229582Z * [new branch] gh/janeyx99/88/base -> origin/gh/janeyx99/88/base 2025-12-04T11:11:09.6229653Z * [new branch] gh/janeyx99/88/head -> origin/gh/janeyx99/88/head 2025-12-04T11:11:09.6229723Z * [new branch] gh/janeyx99/88/orig -> origin/gh/janeyx99/88/orig 2025-12-04T11:11:09.6229799Z * [new branch] gh/jansel/360/base -> origin/gh/jansel/360/base 2025-12-04T11:11:09.6229874Z * [new branch] gh/jansel/360/head -> origin/gh/jansel/360/head 2025-12-04T11:11:09.6229944Z * [new branch] gh/jansel/451/base -> origin/gh/jansel/451/base 2025-12-04T11:11:09.6230017Z * [new branch] gh/jansel/451/head -> origin/gh/jansel/451/head 2025-12-04T11:11:09.6230088Z * [new branch] gh/jansel/451/orig -> origin/gh/jansel/451/orig 2025-12-04T11:11:09.6230155Z * [new branch] gh/jansel/462/base -> origin/gh/jansel/462/base 2025-12-04T11:11:09.6230227Z * [new branch] gh/jansel/462/head -> origin/gh/jansel/462/head 2025-12-04T11:11:09.6230295Z * [new branch] gh/jansel/462/orig -> origin/gh/jansel/462/orig 2025-12-04T11:11:09.6230364Z * [new branch] gh/jansel/533/base -> origin/gh/jansel/533/base 2025-12-04T11:11:09.6230436Z * [new branch] gh/jansel/533/head -> origin/gh/jansel/533/head 2025-12-04T11:11:09.6230506Z * [new branch] gh/jansel/533/orig -> origin/gh/jansel/533/orig 2025-12-04T11:11:09.6230578Z * [new branch] gh/jansel/552/base -> origin/gh/jansel/552/base 2025-12-04T11:11:09.6230651Z * [new branch] 
gh/jansel/552/head -> origin/gh/jansel/552/head 2025-12-04T11:11:09.6230721Z * [new branch] gh/jansel/552/orig -> origin/gh/jansel/552/orig 2025-12-04T11:11:09.6230791Z * [new branch] gh/jansel/553/base -> origin/gh/jansel/553/base 2025-12-04T11:11:09.6230865Z * [new branch] gh/jansel/553/head -> origin/gh/jansel/553/head 2025-12-04T11:11:09.6230936Z * [new branch] gh/jansel/553/orig -> origin/gh/jansel/553/orig 2025-12-04T11:11:09.6231005Z * [new branch] gh/jansel/554/base -> origin/gh/jansel/554/base 2025-12-04T11:11:09.6231079Z * [new branch] gh/jansel/554/head -> origin/gh/jansel/554/head 2025-12-04T11:11:09.6231150Z * [new branch] gh/jansel/554/orig -> origin/gh/jansel/554/orig 2025-12-04T11:11:09.6231219Z * [new branch] gh/jansel/555/base -> origin/gh/jansel/555/base 2025-12-04T11:11:09.6231293Z * [new branch] gh/jansel/555/head -> origin/gh/jansel/555/head 2025-12-04T11:11:09.6231364Z * [new branch] gh/jansel/555/orig -> origin/gh/jansel/555/orig 2025-12-04T11:11:09.6231437Z * [new branch] gh/jansel/556/base -> origin/gh/jansel/556/base 2025-12-04T11:11:09.6231507Z * [new branch] gh/jansel/556/head -> origin/gh/jansel/556/head 2025-12-04T11:11:09.6231576Z * [new branch] gh/jansel/556/orig -> origin/gh/jansel/556/orig 2025-12-04T11:11:09.6231649Z * [new branch] gh/jansel/557/base -> origin/gh/jansel/557/base 2025-12-04T11:11:09.6231719Z * [new branch] gh/jansel/557/head -> origin/gh/jansel/557/head 2025-12-04T11:11:09.6231807Z * [new branch] gh/jansel/557/orig -> origin/gh/jansel/557/orig 2025-12-04T11:11:09.6231882Z * [new branch] gh/jansel/558/base -> origin/gh/jansel/558/base 2025-12-04T11:11:09.6231953Z * [new branch] gh/jansel/558/head -> origin/gh/jansel/558/head 2025-12-04T11:11:09.6232050Z * [new branch] gh/jansel/558/orig -> origin/gh/jansel/558/orig 2025-12-04T11:11:09.6232124Z * [new branch] gh/jansel/559/base -> origin/gh/jansel/559/base 2025-12-04T11:11:09.6232194Z * [new branch] gh/jansel/559/head -> origin/gh/jansel/559/head 2025-12-04T11:11:09.6232262Z * [new branch] gh/jansel/559/orig -> origin/gh/jansel/559/orig 2025-12-04T11:11:09.6232335Z * [new branch] gh/jansel/560/base -> origin/gh/jansel/560/base 2025-12-04T11:11:09.6232405Z * [new branch] gh/jansel/560/head -> origin/gh/jansel/560/head 2025-12-04T11:11:09.6232475Z * [new branch] gh/jansel/560/orig -> origin/gh/jansel/560/orig 2025-12-04T11:11:09.6232549Z * [new branch] gh/jansel/561/base -> origin/gh/jansel/561/base 2025-12-04T11:11:09.6232619Z * [new branch] gh/jansel/561/head -> origin/gh/jansel/561/head 2025-12-04T11:11:09.6232690Z * [new branch] gh/jansel/561/orig -> origin/gh/jansel/561/orig 2025-12-04T11:11:09.6232763Z * [new branch] gh/jansel/562/base -> origin/gh/jansel/562/base 2025-12-04T11:11:09.6232832Z * [new branch] gh/jansel/562/head -> origin/gh/jansel/562/head 2025-12-04T11:11:09.6232901Z * [new branch] gh/jansel/562/orig -> origin/gh/jansel/562/orig 2025-12-04T11:11:09.6232974Z * [new branch] gh/jansel/563/base -> origin/gh/jansel/563/base 2025-12-04T11:11:09.6233043Z * [new branch] gh/jansel/563/head -> origin/gh/jansel/563/head 2025-12-04T11:11:09.6233117Z * [new branch] gh/jansel/563/orig -> origin/gh/jansel/563/orig 2025-12-04T11:11:09.6233187Z * [new branch] gh/jansel/564/base -> origin/gh/jansel/564/base 2025-12-04T11:11:09.6233257Z * [new branch] gh/jansel/564/head -> origin/gh/jansel/564/head 2025-12-04T11:11:09.6233331Z * [new branch] gh/jansel/564/orig -> origin/gh/jansel/564/orig 2025-12-04T11:11:09.6233400Z * [new branch] gh/jansel/565/base -> origin/gh/jansel/565/base 
2025-12-04T11:11:09.6233469Z * [new branch] gh/jansel/565/head -> origin/gh/jansel/565/head 2025-12-04T11:11:09.6233544Z * [new branch] gh/jansel/565/orig -> origin/gh/jansel/565/orig 2025-12-04T11:11:09.6233612Z * [new branch] gh/jansel/566/base -> origin/gh/jansel/566/base 2025-12-04T11:11:09.6233683Z * [new branch] gh/jansel/566/head -> origin/gh/jansel/566/head 2025-12-04T11:11:09.6233757Z * [new branch] gh/jansel/566/orig -> origin/gh/jansel/566/orig 2025-12-04T11:11:09.6233826Z * [new branch] gh/jansel/567/base -> origin/gh/jansel/567/base 2025-12-04T11:11:09.6233896Z * [new branch] gh/jansel/567/head -> origin/gh/jansel/567/head 2025-12-04T11:11:09.6233970Z * [new branch] gh/jansel/567/orig -> origin/gh/jansel/567/orig 2025-12-04T11:11:09.6234039Z * [new branch] gh/jansel/568/base -> origin/gh/jansel/568/base 2025-12-04T11:11:09.6234109Z * [new branch] gh/jansel/568/head -> origin/gh/jansel/568/head 2025-12-04T11:11:09.6234184Z * [new branch] gh/jansel/568/orig -> origin/gh/jansel/568/orig 2025-12-04T11:11:09.6234253Z * [new branch] gh/jansel/569/base -> origin/gh/jansel/569/base 2025-12-04T11:11:09.6234322Z * [new branch] gh/jansel/569/head -> origin/gh/jansel/569/head 2025-12-04T11:11:09.6234417Z * [new branch] gh/jansel/569/orig -> origin/gh/jansel/569/orig 2025-12-04T11:11:09.6234487Z * [new branch] gh/jansel/570/base -> origin/gh/jansel/570/base 2025-12-04T11:11:09.6234557Z * [new branch] gh/jansel/570/head -> origin/gh/jansel/570/head 2025-12-04T11:11:09.6234652Z * [new branch] gh/jansel/570/orig -> origin/gh/jansel/570/orig 2025-12-04T11:11:09.6234722Z * [new branch] gh/jansel/571/base -> origin/gh/jansel/571/base 2025-12-04T11:11:09.6234796Z * [new branch] gh/jansel/571/head -> origin/gh/jansel/571/head 2025-12-04T11:11:09.6234865Z * [new branch] gh/jansel/571/orig -> origin/gh/jansel/571/orig 2025-12-04T11:11:09.6234934Z * [new branch] gh/jansel/572/base -> origin/gh/jansel/572/base 2025-12-04T11:11:09.6235008Z * [new branch] gh/jansel/572/head -> origin/gh/jansel/572/head 2025-12-04T11:11:09.6235079Z * [new branch] gh/jansel/572/orig -> origin/gh/jansel/572/orig 2025-12-04T11:11:09.6235151Z * [new branch] gh/jansel/573/base -> origin/gh/jansel/573/base 2025-12-04T11:11:09.6235224Z * [new branch] gh/jansel/573/head -> origin/gh/jansel/573/head 2025-12-04T11:11:09.6235294Z * [new branch] gh/jansel/573/orig -> origin/gh/jansel/573/orig 2025-12-04T11:11:09.6235363Z * [new branch] gh/jansel/574/base -> origin/gh/jansel/574/base 2025-12-04T11:11:09.6235433Z * [new branch] gh/jansel/574/head -> origin/gh/jansel/574/head 2025-12-04T11:11:09.6235501Z * [new branch] gh/jansel/574/orig -> origin/gh/jansel/574/orig 2025-12-04T11:11:09.6235568Z * [new branch] gh/jansel/575/base -> origin/gh/jansel/575/base 2025-12-04T11:11:09.6235637Z * [new branch] gh/jansel/575/head -> origin/gh/jansel/575/head 2025-12-04T11:11:09.6235703Z * [new branch] gh/jansel/575/orig -> origin/gh/jansel/575/orig 2025-12-04T11:11:09.6235774Z * [new branch] gh/jansel/576/base -> origin/gh/jansel/576/base 2025-12-04T11:11:09.6235848Z * [new branch] gh/jansel/576/head -> origin/gh/jansel/576/head 2025-12-04T11:11:09.6235919Z * [new branch] gh/jansel/576/orig -> origin/gh/jansel/576/orig 2025-12-04T11:11:09.6236002Z * [new branch] gh/jbschlosser/247/base -> origin/gh/jbschlosser/247/base 2025-12-04T11:11:09.6236088Z * [new branch] gh/jbschlosser/247/head -> origin/gh/jbschlosser/247/head 2025-12-04T11:11:09.6236167Z * [new branch] gh/jbschlosser/247/orig -> origin/gh/jbschlosser/247/orig 2025-12-04T11:11:09.6236244Z 
* [new branch] gh/jbschlosser/250/base -> origin/gh/jbschlosser/250/base 2025-12-04T11:11:09.6236325Z * [new branch] gh/jbschlosser/250/head -> origin/gh/jbschlosser/250/head 2025-12-04T11:11:09.6236403Z * [new branch] gh/jbschlosser/250/orig -> origin/gh/jbschlosser/250/orig 2025-12-04T11:11:09.6236482Z * [new branch] gh/jerryzh168/1/base -> origin/gh/jerryzh168/1/base 2025-12-04T11:11:09.6236556Z * [new branch] gh/jerryzh168/1/head -> origin/gh/jerryzh168/1/head 2025-12-04T11:11:09.6236632Z * [new branch] gh/jerryzh168/1/orig -> origin/gh/jerryzh168/1/orig 2025-12-04T11:11:09.6236712Z * [new branch] gh/jiayisunx/59/base -> origin/gh/jiayisunx/59/base 2025-12-04T11:11:09.6236786Z * [new branch] gh/jiayisunx/59/head -> origin/gh/jiayisunx/59/head 2025-12-04T11:11:09.6236861Z * [new branch] gh/jiayisunx/59/orig -> origin/gh/jiayisunx/59/orig 2025-12-04T11:11:09.6236937Z * [new branch] gh/jiayisunx/61/base -> origin/gh/jiayisunx/61/base 2025-12-04T11:11:09.6237010Z * [new branch] gh/jiayisunx/61/head -> origin/gh/jiayisunx/61/head 2025-12-04T11:11:09.6237108Z * [new branch] gh/jiayisunx/61/orig -> origin/gh/jiayisunx/61/orig 2025-12-04T11:11:09.6237185Z * [new branch] gh/jiayisunx/68/base -> origin/gh/jiayisunx/68/base 2025-12-04T11:11:09.6237258Z * [new branch] gh/jiayisunx/68/head -> origin/gh/jiayisunx/68/head 2025-12-04T11:11:09.6237354Z * [new branch] gh/jiayisunx/68/orig -> origin/gh/jiayisunx/68/orig 2025-12-04T11:11:09.6237432Z * [new branch] gh/jiayisunx/77/base -> origin/gh/jiayisunx/77/base 2025-12-04T11:11:09.6237506Z * [new branch] gh/jiayisunx/77/head -> origin/gh/jiayisunx/77/head 2025-12-04T11:11:09.6237579Z * [new branch] gh/jiayisunx/77/orig -> origin/gh/jiayisunx/77/orig 2025-12-04T11:11:09.6237656Z * [new branch] gh/jiayisunx/78/base -> origin/gh/jiayisunx/78/base 2025-12-04T11:11:09.6237730Z * [new branch] gh/jiayisunx/78/head -> origin/gh/jiayisunx/78/head 2025-12-04T11:11:09.6237805Z * [new branch] gh/jiayisunx/78/orig -> origin/gh/jiayisunx/78/orig 2025-12-04T11:11:09.6237881Z * [new branch] gh/jiayisunx/79/base -> origin/gh/jiayisunx/79/base 2025-12-04T11:11:09.6237953Z * [new branch] gh/jiayisunx/79/head -> origin/gh/jiayisunx/79/head 2025-12-04T11:11:09.6238031Z * [new branch] gh/jiayisunx/79/orig -> origin/gh/jiayisunx/79/orig 2025-12-04T11:11:09.6238103Z * [new branch] gh/jiayisunx/82/base -> origin/gh/jiayisunx/82/base 2025-12-04T11:11:09.6238210Z * [new branch] gh/jiayisunx/82/head -> origin/gh/jiayisunx/82/head 2025-12-04T11:11:09.6238290Z * [new branch] gh/jiayisunx/82/orig -> origin/gh/jiayisunx/82/orig 2025-12-04T11:11:09.6238365Z * [new branch] gh/jiayisunx/83/base -> origin/gh/jiayisunx/83/base 2025-12-04T11:11:09.6238438Z * [new branch] gh/jiayisunx/83/head -> origin/gh/jiayisunx/83/head 2025-12-04T11:11:09.6238519Z * [new branch] gh/jiayisunx/83/orig -> origin/gh/jiayisunx/83/orig 2025-12-04T11:11:09.6238593Z * [new branch] gh/jiayisunx/84/base -> origin/gh/jiayisunx/84/base 2025-12-04T11:11:09.6238668Z * [new branch] gh/jiayisunx/84/head -> origin/gh/jiayisunx/84/head 2025-12-04T11:11:09.6238748Z * [new branch] gh/jiayisunx/84/orig -> origin/gh/jiayisunx/84/orig 2025-12-04T11:11:09.6238821Z * [new branch] gh/jiayisunx/85/base -> origin/gh/jiayisunx/85/base 2025-12-04T11:11:09.6238895Z * [new branch] gh/jiayisunx/85/head -> origin/gh/jiayisunx/85/head 2025-12-04T11:11:09.6238974Z * [new branch] gh/jiayisunx/85/orig -> origin/gh/jiayisunx/85/orig 2025-12-04T11:11:09.6239049Z * [new branch] gh/jiayisunx/86/base -> origin/gh/jiayisunx/86/base 
2025-12-04T11:11:09.6239122Z * [new branch] gh/jiayisunx/86/head -> origin/gh/jiayisunx/86/head 2025-12-04T11:11:09.6239197Z * [new branch] gh/jiayisunx/86/orig -> origin/gh/jiayisunx/86/orig 2025-12-04T11:11:09.6239271Z * [new branch] gh/jiayisunx/87/base -> origin/gh/jiayisunx/87/base 2025-12-04T11:11:09.6239345Z * [new branch] gh/jiayisunx/87/head -> origin/gh/jiayisunx/87/head 2025-12-04T11:11:09.6239420Z * [new branch] gh/jiayisunx/87/orig -> origin/gh/jiayisunx/87/orig 2025-12-04T11:11:09.6239494Z * [new branch] gh/jiayisunx/88/base -> origin/gh/jiayisunx/88/base 2025-12-04T11:11:09.6239566Z * [new branch] gh/jiayisunx/88/head -> origin/gh/jiayisunx/88/head 2025-12-04T11:11:09.6239643Z * [new branch] gh/jiayisunx/88/orig -> origin/gh/jiayisunx/88/orig 2025-12-04T11:11:09.6239716Z * [new branch] gh/jiayisunx/89/base -> origin/gh/jiayisunx/89/base 2025-12-04T11:11:09.6239792Z * [new branch] gh/jiayisunx/89/head -> origin/gh/jiayisunx/89/head 2025-12-04T11:11:09.6239896Z * [new branch] gh/jiayisunx/89/orig -> origin/gh/jiayisunx/89/orig 2025-12-04T11:11:09.6239969Z * [new branch] gh/jiayisunx/90/base -> origin/gh/jiayisunx/90/base 2025-12-04T11:11:09.6240071Z * [new branch] gh/jiayisunx/90/head -> origin/gh/jiayisunx/90/head 2025-12-04T11:11:09.6240144Z * [new branch] gh/jiayisunx/90/orig -> origin/gh/jiayisunx/90/orig 2025-12-04T11:11:09.6240224Z * [new branch] gh/jjwu@meta.com/1/base -> origin/gh/jjwu@meta.com/1/base 2025-12-04T11:11:09.6240306Z * [new branch] gh/jjwu@meta.com/1/head -> origin/gh/jjwu@meta.com/1/head 2025-12-04T11:11:09.6240377Z * [new branch] gh/jturney/1/base -> origin/gh/jturney/1/base 2025-12-04T11:11:09.6240448Z * [new branch] gh/jturney/1/head -> origin/gh/jturney/1/head 2025-12-04T11:11:09.6240520Z * [new branch] gh/jturney/1/orig -> origin/gh/jturney/1/orig 2025-12-04T11:11:09.6240593Z * [new branch] gh/jturney/2/base -> origin/gh/jturney/2/base 2025-12-04T11:11:09.6240663Z * [new branch] gh/jturney/2/head -> origin/gh/jturney/2/head 2025-12-04T11:11:09.6240739Z * [new branch] gh/jturney/2/orig -> origin/gh/jturney/2/orig 2025-12-04T11:11:09.6240818Z * [new branch] gh/karthickai/10/base -> origin/gh/karthickai/10/base 2025-12-04T11:11:09.6240896Z * [new branch] gh/karthickai/10/head -> origin/gh/karthickai/10/head 2025-12-04T11:11:09.6240977Z * [new branch] gh/karthickai/10/orig -> origin/gh/karthickai/10/orig 2025-12-04T11:11:09.6241056Z * [new branch] gh/karthickai/11/base -> origin/gh/karthickai/11/base 2025-12-04T11:11:09.6241132Z * [new branch] gh/karthickai/11/head -> origin/gh/karthickai/11/head 2025-12-04T11:11:09.6241214Z * [new branch] gh/karthickai/11/orig -> origin/gh/karthickai/11/orig 2025-12-04T11:11:09.6241290Z * [new branch] gh/karthickai/12/base -> origin/gh/karthickai/12/base 2025-12-04T11:11:09.6241369Z * [new branch] gh/karthickai/12/head -> origin/gh/karthickai/12/head 2025-12-04T11:11:09.6241445Z * [new branch] gh/karthickai/12/orig -> origin/gh/karthickai/12/orig 2025-12-04T11:11:09.6241519Z * [new branch] gh/karthickai/13/base -> origin/gh/karthickai/13/base 2025-12-04T11:11:09.6241598Z * [new branch] gh/karthickai/13/head -> origin/gh/karthickai/13/head 2025-12-04T11:11:09.6241673Z * [new branch] gh/karthickai/13/orig -> origin/gh/karthickai/13/orig 2025-12-04T11:11:09.6241748Z * [new branch] gh/karthickai/14/base -> origin/gh/karthickai/14/base 2025-12-04T11:11:09.6241825Z * [new branch] gh/karthickai/14/head -> origin/gh/karthickai/14/head 2025-12-04T11:11:09.6241901Z * [new branch] gh/karthickai/14/orig -> 
origin/gh/karthickai/14/orig 2025-12-04T11:11:09.6241978Z * [new branch] gh/karthickai/15/base -> origin/gh/karthickai/15/base 2025-12-04T11:11:09.6242057Z * [new branch] gh/karthickai/15/head -> origin/gh/karthickai/15/head 2025-12-04T11:11:09.6242133Z * [new branch] gh/karthickai/15/orig -> origin/gh/karthickai/15/orig 2025-12-04T11:11:09.6242208Z * [new branch] gh/karthickai/16/base -> origin/gh/karthickai/16/base 2025-12-04T11:11:09.6242287Z * [new branch] gh/karthickai/16/head -> origin/gh/karthickai/16/head 2025-12-04T11:11:09.6242363Z * [new branch] gh/karthickai/16/orig -> origin/gh/karthickai/16/orig 2025-12-04T11:11:09.6242438Z * [new branch] gh/karthickai/17/base -> origin/gh/karthickai/17/base 2025-12-04T11:11:09.6242516Z * [new branch] gh/karthickai/17/head -> origin/gh/karthickai/17/head 2025-12-04T11:11:09.6242613Z * [new branch] gh/karthickai/17/orig -> origin/gh/karthickai/17/orig 2025-12-04T11:11:09.6242689Z * [new branch] gh/karthickai/18/base -> origin/gh/karthickai/18/base 2025-12-04T11:11:09.6242766Z * [new branch] gh/karthickai/18/head -> origin/gh/karthickai/18/head 2025-12-04T11:11:09.6242872Z * [new branch] gh/karthickai/18/orig -> origin/gh/karthickai/18/orig 2025-12-04T11:11:09.6242951Z * [new branch] gh/karthickai/19/base -> origin/gh/karthickai/19/base 2025-12-04T11:11:09.6243024Z * [new branch] gh/karthickai/19/head -> origin/gh/karthickai/19/head 2025-12-04T11:11:09.6243098Z * [new branch] gh/karthickai/19/orig -> origin/gh/karthickai/19/orig 2025-12-04T11:11:09.6243172Z * [new branch] gh/karthickai/20/base -> origin/gh/karthickai/20/base 2025-12-04T11:11:09.6243247Z * [new branch] gh/karthickai/20/head -> origin/gh/karthickai/20/head 2025-12-04T11:11:09.6243324Z * [new branch] gh/karthickai/20/orig -> origin/gh/karthickai/20/orig 2025-12-04T11:11:09.6243402Z * [new branch] gh/karthickai/21/base -> origin/gh/karthickai/21/base 2025-12-04T11:11:09.6243477Z * [new branch] gh/karthickai/21/head -> origin/gh/karthickai/21/head 2025-12-04T11:11:09.6243553Z * [new branch] gh/karthickai/21/orig -> origin/gh/karthickai/21/orig 2025-12-04T11:11:09.6243632Z * [new branch] gh/karthickai/22/base -> origin/gh/karthickai/22/base 2025-12-04T11:11:09.6243706Z * [new branch] gh/karthickai/22/head -> origin/gh/karthickai/22/head 2025-12-04T11:11:09.6243781Z * [new branch] gh/karthickai/22/orig -> origin/gh/karthickai/22/orig 2025-12-04T11:11:09.6243858Z * [new branch] gh/karthickai/23/base -> origin/gh/karthickai/23/base 2025-12-04T11:11:09.6243934Z * [new branch] gh/karthickai/23/head -> origin/gh/karthickai/23/head 2025-12-04T11:11:09.6244009Z * [new branch] gh/karthickai/23/orig -> origin/gh/karthickai/23/orig 2025-12-04T11:11:09.6244088Z * [new branch] gh/karthickai/24/base -> origin/gh/karthickai/24/base 2025-12-04T11:11:09.6244163Z * [new branch] gh/karthickai/24/head -> origin/gh/karthickai/24/head 2025-12-04T11:11:09.6244239Z * [new branch] gh/karthickai/24/orig -> origin/gh/karthickai/24/orig 2025-12-04T11:11:09.6244317Z * [new branch] gh/karthickai/25/base -> origin/gh/karthickai/25/base 2025-12-04T11:11:09.6244392Z * [new branch] gh/karthickai/25/head -> origin/gh/karthickai/25/head 2025-12-04T11:11:09.6244466Z * [new branch] gh/karthickai/25/orig -> origin/gh/karthickai/25/orig 2025-12-04T11:11:09.6244544Z * [new branch] gh/karthickai/26/base -> origin/gh/karthickai/26/base 2025-12-04T11:11:09.6244619Z * [new branch] gh/karthickai/26/head -> origin/gh/karthickai/26/head 2025-12-04T11:11:09.6244698Z * [new branch] gh/karthickai/26/orig -> 
origin/gh/karthickai/26/orig 2025-12-04T11:11:09.6244771Z * [new branch] gh/karthickai/6/base -> origin/gh/karthickai/6/base 2025-12-04T11:11:09.6244848Z * [new branch] gh/karthickai/6/head -> origin/gh/karthickai/6/head 2025-12-04T11:11:09.6244927Z * [new branch] gh/karthickai/6/orig -> origin/gh/karthickai/6/orig 2025-12-04T11:11:09.6244998Z * [new branch] gh/krocki/1/base -> origin/gh/krocki/1/base 2025-12-04T11:11:09.6245070Z * [new branch] gh/krocki/1/head -> origin/gh/krocki/1/head 2025-12-04T11:11:09.6245143Z * [new branch] gh/krocki/1/orig -> origin/gh/krocki/1/orig 2025-12-04T11:11:09.6245212Z * [new branch] gh/krocki/2/base -> origin/gh/krocki/2/base 2025-12-04T11:11:09.6245280Z * [new branch] gh/krocki/2/head -> origin/gh/krocki/2/head 2025-12-04T11:11:09.6245376Z * [new branch] gh/krocki/2/orig -> origin/gh/krocki/2/orig 2025-12-04T11:11:09.6245460Z * [new branch] gh/kurtamohler/60/base -> origin/gh/kurtamohler/60/base 2025-12-04T11:11:09.6245566Z * [new branch] gh/kurtamohler/60/head -> origin/gh/kurtamohler/60/head 2025-12-04T11:11:09.6245648Z * [new branch] gh/kurtamohler/60/orig -> origin/gh/kurtamohler/60/orig 2025-12-04T11:11:09.6245726Z * [new branch] gh/kurtamohler/61/base -> origin/gh/kurtamohler/61/base 2025-12-04T11:11:09.6245804Z * [new branch] gh/kurtamohler/61/head -> origin/gh/kurtamohler/61/head 2025-12-04T11:11:09.6245884Z * [new branch] gh/kurtamohler/61/orig -> origin/gh/kurtamohler/61/orig 2025-12-04T11:11:09.6245959Z * [new branch] gh/kurtamohler/62/base -> origin/gh/kurtamohler/62/base 2025-12-04T11:11:09.6246035Z * [new branch] gh/kurtamohler/62/head -> origin/gh/kurtamohler/62/head 2025-12-04T11:11:09.6246116Z * [new branch] gh/kurtamohler/62/orig -> origin/gh/kurtamohler/62/orig 2025-12-04T11:11:09.6246194Z * [new branch] gh/kurtamohler/63/base -> origin/gh/kurtamohler/63/base 2025-12-04T11:11:09.6246275Z * [new branch] gh/kurtamohler/63/head -> origin/gh/kurtamohler/63/head 2025-12-04T11:11:09.6246351Z * [new branch] gh/kurtamohler/63/orig -> origin/gh/kurtamohler/63/orig 2025-12-04T11:11:09.6246426Z * [new branch] gh/kurtamohler/64/base -> origin/gh/kurtamohler/64/base 2025-12-04T11:11:09.6246505Z * [new branch] gh/kurtamohler/64/head -> origin/gh/kurtamohler/64/head 2025-12-04T11:11:09.6246581Z * [new branch] gh/kurtamohler/64/orig -> origin/gh/kurtamohler/64/orig 2025-12-04T11:11:09.6246656Z * [new branch] gh/kurtamohler/65/base -> origin/gh/kurtamohler/65/base 2025-12-04T11:11:09.6246737Z * [new branch] gh/kurtamohler/65/head -> origin/gh/kurtamohler/65/head 2025-12-04T11:11:09.6246812Z * [new branch] gh/kurtamohler/65/orig -> origin/gh/kurtamohler/65/orig 2025-12-04T11:11:09.6246888Z * [new branch] gh/kurtamohler/66/base -> origin/gh/kurtamohler/66/base 2025-12-04T11:11:09.6246972Z * [new branch] gh/kurtamohler/66/head -> origin/gh/kurtamohler/66/head 2025-12-04T11:11:09.6247049Z * [new branch] gh/kurtamohler/66/orig -> origin/gh/kurtamohler/66/orig 2025-12-04T11:11:09.6247125Z * [new branch] gh/kurtamohler/67/base -> origin/gh/kurtamohler/67/base 2025-12-04T11:11:09.6247204Z * [new branch] gh/kurtamohler/67/head -> origin/gh/kurtamohler/67/head 2025-12-04T11:11:09.6247280Z * [new branch] gh/kurtamohler/67/orig -> origin/gh/kurtamohler/67/orig 2025-12-04T11:11:09.6247352Z * [new branch] gh/kwen2501/130/base -> origin/gh/kwen2501/130/base 2025-12-04T11:11:09.6247428Z * [new branch] gh/kwen2501/130/head -> origin/gh/kwen2501/130/head 2025-12-04T11:11:09.6247501Z * [new branch] gh/kwen2501/130/orig -> origin/gh/kwen2501/130/orig 
2025-12-04T11:11:09.6247572Z * [new branch] gh/kwen2501/170/base -> origin/gh/kwen2501/170/base 2025-12-04T11:11:09.6247647Z * [new branch] gh/kwen2501/170/head -> origin/gh/kwen2501/170/head 2025-12-04T11:11:09.6247716Z * [new branch] gh/kwen2501/187/base -> origin/gh/kwen2501/187/base 2025-12-04T11:11:09.6247785Z * [new branch] gh/kwen2501/187/head -> origin/gh/kwen2501/187/head 2025-12-04T11:11:09.6247858Z * [new branch] gh/kwen2501/187/orig -> origin/gh/kwen2501/187/orig 2025-12-04T11:11:09.6247928Z * [new branch] gh/kwen2501/188/base -> origin/gh/kwen2501/188/base 2025-12-04T11:11:09.6248002Z * [new branch] gh/kwen2501/188/head -> origin/gh/kwen2501/188/head 2025-12-04T11:11:09.6248094Z * [new branch] gh/kwen2501/188/orig -> origin/gh/kwen2501/188/orig 2025-12-04T11:11:09.6248200Z * [new branch] gh/kwen2501/211/base -> origin/gh/kwen2501/211/base 2025-12-04T11:11:09.6248275Z * [new branch] gh/kwen2501/211/head -> origin/gh/kwen2501/211/head 2025-12-04T11:11:09.6248382Z * [new branch] gh/kwen2501/224/base -> origin/gh/kwen2501/224/base 2025-12-04T11:11:09.6248452Z * [new branch] gh/kwen2501/224/head -> origin/gh/kwen2501/224/head 2025-12-04T11:11:09.6248526Z * [new branch] gh/kwen2501/224/orig -> origin/gh/kwen2501/224/orig 2025-12-04T11:11:09.6248595Z * [new branch] gh/kwen2501/228/base -> origin/gh/kwen2501/228/base 2025-12-04T11:11:09.6248666Z * [new branch] gh/kwen2501/228/head -> origin/gh/kwen2501/228/head 2025-12-04T11:11:09.6248739Z * [new branch] gh/kwen2501/228/orig -> origin/gh/kwen2501/228/orig 2025-12-04T11:11:09.6248811Z * [new branch] gh/kwen2501/234/base -> origin/gh/kwen2501/234/base 2025-12-04T11:11:09.6248881Z * [new branch] gh/kwen2501/234/head -> origin/gh/kwen2501/234/head 2025-12-04T11:11:09.6248955Z * [new branch] gh/kwen2501/234/orig -> origin/gh/kwen2501/234/orig 2025-12-04T11:11:09.6249025Z * [new branch] gh/kwen2501/235/base -> origin/gh/kwen2501/235/base 2025-12-04T11:11:09.6249095Z * [new branch] gh/kwen2501/235/head -> origin/gh/kwen2501/235/head 2025-12-04T11:11:09.6249169Z * [new branch] gh/kwen2501/235/orig -> origin/gh/kwen2501/235/orig 2025-12-04T11:11:09.6249239Z * [new branch] gh/kwen2501/236/base -> origin/gh/kwen2501/236/base 2025-12-04T11:11:09.6249310Z * [new branch] gh/kwen2501/236/head -> origin/gh/kwen2501/236/head 2025-12-04T11:11:09.6249385Z * [new branch] gh/kwen2501/236/orig -> origin/gh/kwen2501/236/orig 2025-12-04T11:11:09.6249456Z * [new branch] gh/kwen2501/237/base -> origin/gh/kwen2501/237/base 2025-12-04T11:11:09.6249529Z * [new branch] gh/kwen2501/237/head -> origin/gh/kwen2501/237/head 2025-12-04T11:11:09.6249602Z * [new branch] gh/kwen2501/237/orig -> origin/gh/kwen2501/237/orig 2025-12-04T11:11:09.6249672Z * [new branch] gh/kwen2501/238/base -> origin/gh/kwen2501/238/base 2025-12-04T11:11:09.6249744Z * [new branch] gh/kwen2501/238/head -> origin/gh/kwen2501/238/head 2025-12-04T11:11:09.6249815Z * [new branch] gh/kwen2501/238/orig -> origin/gh/kwen2501/238/orig 2025-12-04T11:11:09.6249886Z * [new branch] gh/kwen2501/240/base -> origin/gh/kwen2501/240/base 2025-12-04T11:11:09.6249960Z * [new branch] gh/kwen2501/240/head -> origin/gh/kwen2501/240/head 2025-12-04T11:11:09.6250032Z * [new branch] gh/kwen2501/240/orig -> origin/gh/kwen2501/240/orig 2025-12-04T11:11:09.6250105Z * [new branch] gh/kwen2501/241/base -> origin/gh/kwen2501/241/base 2025-12-04T11:11:09.6250179Z * [new branch] gh/kwen2501/241/head -> origin/gh/kwen2501/241/head 2025-12-04T11:11:09.6250250Z * [new branch] gh/kwen2501/241/orig -> origin/gh/kwen2501/241/orig 
2025-12-04T11:11:09.6250321Z * [new branch] gh/kwen2501/247/base -> origin/gh/kwen2501/247/base 2025-12-04T11:11:09.6250396Z * [new branch] gh/kwen2501/247/head -> origin/gh/kwen2501/247/head 2025-12-04T11:11:09.6250468Z * [new branch] gh/kwen2501/247/orig -> origin/gh/kwen2501/247/orig 2025-12-04T11:11:09.6250539Z * [new branch] gh/kwen2501/252/base -> origin/gh/kwen2501/252/base 2025-12-04T11:11:09.6250613Z * [new branch] gh/kwen2501/252/head -> origin/gh/kwen2501/252/head 2025-12-04T11:11:09.6250684Z * [new branch] gh/kwen2501/252/orig -> origin/gh/kwen2501/252/orig 2025-12-04T11:11:09.6250782Z * [new branch] gh/kwen2501/259/base -> origin/gh/kwen2501/259/base 2025-12-04T11:11:09.6250857Z * [new branch] gh/kwen2501/259/head -> origin/gh/kwen2501/259/head 2025-12-04T11:11:09.6250948Z * [new branch] gh/kwen2501/259/orig -> origin/gh/kwen2501/259/orig 2025-12-04T11:11:09.6251022Z * [new branch] gh/kwen2501/260/base -> origin/gh/kwen2501/260/base 2025-12-04T11:11:09.6251094Z * [new branch] gh/kwen2501/260/head -> origin/gh/kwen2501/260/head 2025-12-04T11:11:09.6251165Z * [new branch] gh/kwen2501/260/orig -> origin/gh/kwen2501/260/orig 2025-12-04T11:11:09.6251238Z * [new branch] gh/kwen2501/268/base -> origin/gh/kwen2501/268/base 2025-12-04T11:11:09.6251309Z * [new branch] gh/kwen2501/268/head -> origin/gh/kwen2501/268/head 2025-12-04T11:11:09.6251378Z * [new branch] gh/kwen2501/268/orig -> origin/gh/kwen2501/268/orig 2025-12-04T11:11:09.6251451Z * [new branch] gh/kwen2501/269/base -> origin/gh/kwen2501/269/base 2025-12-04T11:11:09.6251522Z * [new branch] gh/kwen2501/269/head -> origin/gh/kwen2501/269/head 2025-12-04T11:11:09.6251593Z * [new branch] gh/kwen2501/269/orig -> origin/gh/kwen2501/269/orig 2025-12-04T11:11:09.6251668Z * [new branch] gh/kwen2501/270/base -> origin/gh/kwen2501/270/base 2025-12-04T11:11:09.6251738Z * [new branch] gh/kwen2501/270/head -> origin/gh/kwen2501/270/head 2025-12-04T11:11:09.6251808Z * [new branch] gh/kwen2501/270/orig -> origin/gh/kwen2501/270/orig 2025-12-04T11:11:09.6251882Z * [new branch] gh/kwen2501/271/base -> origin/gh/kwen2501/271/base 2025-12-04T11:11:09.6251953Z * [new branch] gh/kwen2501/271/head -> origin/gh/kwen2501/271/head 2025-12-04T11:11:09.6252024Z * [new branch] gh/kwen2501/271/orig -> origin/gh/kwen2501/271/orig 2025-12-04T11:11:09.6252100Z * [new branch] gh/kwen2501/274/base -> origin/gh/kwen2501/274/base 2025-12-04T11:11:09.6252172Z * [new branch] gh/kwen2501/274/head -> origin/gh/kwen2501/274/head 2025-12-04T11:11:09.6252244Z * [new branch] gh/kwen2501/274/orig -> origin/gh/kwen2501/274/orig 2025-12-04T11:11:09.6252320Z * [new branch] gh/kwen2501/275/base -> origin/gh/kwen2501/275/base 2025-12-04T11:11:09.6252391Z * [new branch] gh/kwen2501/275/head -> origin/gh/kwen2501/275/head 2025-12-04T11:11:09.6252462Z * [new branch] gh/kwen2501/275/orig -> origin/gh/kwen2501/275/orig 2025-12-04T11:11:09.6252536Z * [new branch] gh/kwen2501/276/base -> origin/gh/kwen2501/276/base 2025-12-04T11:11:09.6252607Z * [new branch] gh/kwen2501/276/head -> origin/gh/kwen2501/276/head 2025-12-04T11:11:09.6252681Z * [new branch] gh/kwen2501/276/orig -> origin/gh/kwen2501/276/orig 2025-12-04T11:11:09.6252752Z * [new branch] gh/kwen2501/277/base -> origin/gh/kwen2501/277/base 2025-12-04T11:11:09.6252825Z * [new branch] gh/kwen2501/277/head -> origin/gh/kwen2501/277/head 2025-12-04T11:11:09.6252902Z * [new branch] gh/kwen2501/277/orig -> origin/gh/kwen2501/277/orig 2025-12-04T11:11:09.6252972Z * [new branch] gh/kwen2501/278/base -> origin/gh/kwen2501/278/base 
2025-12-04T11:11:09.6253044Z * [new branch] gh/kwen2501/278/head -> origin/gh/kwen2501/278/head 2025-12-04T11:11:09.6253120Z * [new branch] gh/kwen2501/278/orig -> origin/gh/kwen2501/278/orig 2025-12-04T11:11:09.6253191Z * [new branch] gh/kwen2501/279/base -> origin/gh/kwen2501/279/base 2025-12-04T11:11:09.6253263Z * [new branch] gh/kwen2501/279/head -> origin/gh/kwen2501/279/head 2025-12-04T11:11:09.6253368Z * [new branch] gh/kwen2501/279/orig -> origin/gh/kwen2501/279/orig 2025-12-04T11:11:09.6253439Z * [new branch] gh/kwen2501/280/base -> origin/gh/kwen2501/280/base 2025-12-04T11:11:09.6253511Z * [new branch] gh/kwen2501/280/head -> origin/gh/kwen2501/280/head 2025-12-04T11:11:09.6253612Z * [new branch] gh/kwen2501/280/orig -> origin/gh/kwen2501/280/orig 2025-12-04T11:11:09.6253683Z * [new branch] gh/kwen2501/281/base -> origin/gh/kwen2501/281/base 2025-12-04T11:11:09.6253753Z * [new branch] gh/kwen2501/281/head -> origin/gh/kwen2501/281/head 2025-12-04T11:11:09.6253827Z * [new branch] gh/kwen2501/281/orig -> origin/gh/kwen2501/281/orig 2025-12-04T11:11:09.6253899Z * [new branch] gh/kwen2501/282/base -> origin/gh/kwen2501/282/base 2025-12-04T11:11:09.6253970Z * [new branch] gh/kwen2501/282/head -> origin/gh/kwen2501/282/head 2025-12-04T11:11:09.6254046Z * [new branch] gh/kwen2501/282/orig -> origin/gh/kwen2501/282/orig 2025-12-04T11:11:09.6254118Z * [new branch] gh/kwen2501/283/base -> origin/gh/kwen2501/283/base 2025-12-04T11:11:09.6254192Z * [new branch] gh/kwen2501/283/head -> origin/gh/kwen2501/283/head 2025-12-04T11:11:09.6254264Z * [new branch] gh/kwen2501/283/orig -> origin/gh/kwen2501/283/orig 2025-12-04T11:11:09.6254335Z * [new branch] gh/kwen2501/284/base -> origin/gh/kwen2501/284/base 2025-12-04T11:11:09.6254410Z * [new branch] gh/kwen2501/284/head -> origin/gh/kwen2501/284/head 2025-12-04T11:11:09.6254482Z * [new branch] gh/kwen2501/284/orig -> origin/gh/kwen2501/284/orig 2025-12-04T11:11:09.6254553Z * [new branch] gh/kwen2501/285/base -> origin/gh/kwen2501/285/base 2025-12-04T11:11:09.6254627Z * [new branch] gh/kwen2501/285/head -> origin/gh/kwen2501/285/head 2025-12-04T11:11:09.6254699Z * [new branch] gh/kwen2501/285/orig -> origin/gh/kwen2501/285/orig 2025-12-04T11:11:09.6254770Z * [new branch] gh/kwen2501/286/base -> origin/gh/kwen2501/286/base 2025-12-04T11:11:09.6254841Z * [new branch] gh/kwen2501/286/head -> origin/gh/kwen2501/286/head 2025-12-04T11:11:09.6254913Z * [new branch] gh/kwen2501/286/orig -> origin/gh/kwen2501/286/orig 2025-12-04T11:11:09.6254984Z * [new branch] gh/kwen2501/287/base -> origin/gh/kwen2501/287/base 2025-12-04T11:11:09.6255059Z * [new branch] gh/kwen2501/287/head -> origin/gh/kwen2501/287/head 2025-12-04T11:11:09.6255131Z * [new branch] gh/kwen2501/287/orig -> origin/gh/kwen2501/287/orig 2025-12-04T11:11:09.6255202Z * [new branch] gh/kwen2501/288/base -> origin/gh/kwen2501/288/base 2025-12-04T11:11:09.6255277Z * [new branch] gh/kwen2501/288/head -> origin/gh/kwen2501/288/head 2025-12-04T11:11:09.6255349Z * [new branch] gh/kwen2501/288/orig -> origin/gh/kwen2501/288/orig 2025-12-04T11:11:09.6255427Z * [new branch] gh/laithsakka/251/base -> origin/gh/laithsakka/251/base 2025-12-04T11:11:09.6255508Z * [new branch] gh/laithsakka/251/head -> origin/gh/laithsakka/251/head 2025-12-04T11:11:09.6255585Z * [new branch] gh/laithsakka/251/orig -> origin/gh/laithsakka/251/orig 2025-12-04T11:11:09.6255663Z * [new branch] gh/laithsakka/276/base -> origin/gh/laithsakka/276/base 2025-12-04T11:11:09.6255738Z * [new branch] gh/laithsakka/276/head -> 
origin/gh/laithsakka/276/head 2025-12-04T11:11:09.6255815Z * [new branch] gh/laithsakka/276/orig -> origin/gh/laithsakka/276/orig 2025-12-04T11:11:09.6255893Z * [new branch] gh/laithsakka/28/base -> origin/gh/laithsakka/28/base 2025-12-04T11:11:09.6255969Z * [new branch] gh/laithsakka/29/base -> origin/gh/laithsakka/29/base 2025-12-04T11:11:09.6256066Z * [new branch] gh/laithsakka/30/base -> origin/gh/laithsakka/30/base 2025-12-04T11:11:09.6256146Z * [new branch] gh/laithsakka/30/head -> origin/gh/laithsakka/30/head 2025-12-04T11:11:09.6256240Z * [new branch] gh/laithsakka/31/base -> origin/gh/laithsakka/31/base 2025-12-04T11:11:09.6256315Z * [new branch] gh/laithsakka/31/head -> origin/gh/laithsakka/31/head 2025-12-04T11:11:09.6256397Z * [new branch] gh/laithsakka/313/base -> origin/gh/laithsakka/313/base 2025-12-04T11:11:09.6256473Z * [new branch] gh/laithsakka/313/head -> origin/gh/laithsakka/313/head 2025-12-04T11:11:09.6256548Z * [new branch] gh/laithsakka/313/orig -> origin/gh/laithsakka/313/orig 2025-12-04T11:11:09.6256629Z * [new branch] gh/laithsakka/316/base -> origin/gh/laithsakka/316/base 2025-12-04T11:11:09.6256703Z * [new branch] gh/laithsakka/316/head -> origin/gh/laithsakka/316/head 2025-12-04T11:11:09.6256780Z * [new branch] gh/laithsakka/316/orig -> origin/gh/laithsakka/316/orig 2025-12-04T11:11:09.6256859Z * [new branch] gh/laithsakka/317/base -> origin/gh/laithsakka/317/base 2025-12-04T11:11:09.6256936Z * [new branch] gh/laithsakka/317/head -> origin/gh/laithsakka/317/head 2025-12-04T11:11:09.6257011Z * [new branch] gh/laithsakka/317/orig -> origin/gh/laithsakka/317/orig 2025-12-04T11:11:09.6257088Z * [new branch] gh/laithsakka/319/base -> origin/gh/laithsakka/319/base 2025-12-04T11:11:09.6257164Z * [new branch] gh/laithsakka/319/head -> origin/gh/laithsakka/319/head 2025-12-04T11:11:09.6257244Z * [new branch] gh/laithsakka/319/orig -> origin/gh/laithsakka/319/orig 2025-12-04T11:11:09.6257319Z * [new branch] gh/laithsakka/32/base -> origin/gh/laithsakka/32/base 2025-12-04T11:11:09.6257397Z * [new branch] gh/laithsakka/32/head -> origin/gh/laithsakka/32/head 2025-12-04T11:11:09.6257477Z * [new branch] gh/laithsakka/320/base -> origin/gh/laithsakka/320/base 2025-12-04T11:11:09.6257552Z * [new branch] gh/laithsakka/320/head -> origin/gh/laithsakka/320/head 2025-12-04T11:11:09.6257629Z * [new branch] gh/laithsakka/320/orig -> origin/gh/laithsakka/320/orig 2025-12-04T11:11:09.6257708Z * [new branch] gh/laithsakka/321/base -> origin/gh/laithsakka/321/base 2025-12-04T11:11:09.6257784Z * [new branch] gh/laithsakka/321/head -> origin/gh/laithsakka/321/head 2025-12-04T11:11:09.6257859Z * [new branch] gh/laithsakka/321/orig -> origin/gh/laithsakka/321/orig 2025-12-04T11:11:09.6257937Z * [new branch] gh/laithsakka/322/base -> origin/gh/laithsakka/322/base 2025-12-04T11:11:09.6258012Z * [new branch] gh/laithsakka/322/head -> origin/gh/laithsakka/322/head 2025-12-04T11:11:09.6258089Z * [new branch] gh/laithsakka/322/orig -> origin/gh/laithsakka/322/orig 2025-12-04T11:11:09.6258211Z * [new branch] gh/laithsakka/323/base -> origin/gh/laithsakka/323/base 2025-12-04T11:11:09.6258287Z * [new branch] gh/laithsakka/323/head -> origin/gh/laithsakka/323/head 2025-12-04T11:11:09.6258363Z * [new branch] gh/laithsakka/323/orig -> origin/gh/laithsakka/323/orig 2025-12-04T11:11:09.6258440Z * [new branch] gh/laithsakka/324/base -> origin/gh/laithsakka/324/base 2025-12-04T11:11:09.6258515Z * [new branch] gh/laithsakka/324/head -> origin/gh/laithsakka/324/head 2025-12-04T11:11:09.6258589Z * [new 
branch] gh/laithsakka/324/orig -> origin/gh/laithsakka/324/orig 2025-12-04T11:11:09.6258664Z * [new branch] gh/laithsakka/325/base -> origin/gh/laithsakka/325/base 2025-12-04T11:11:09.6258739Z * [new branch] gh/laithsakka/325/head -> origin/gh/laithsakka/325/head 2025-12-04T11:11:09.6258840Z * [new branch] gh/laithsakka/325/orig -> origin/gh/laithsakka/325/orig 2025-12-04T11:11:09.6258919Z * [new branch] gh/laithsakka/326/base -> origin/gh/laithsakka/326/base 2025-12-04T11:11:09.6259024Z * [new branch] gh/laithsakka/326/head -> origin/gh/laithsakka/326/head 2025-12-04T11:11:09.6259102Z * [new branch] gh/laithsakka/326/orig -> origin/gh/laithsakka/326/orig 2025-12-04T11:11:09.6259176Z * [new branch] gh/laithsakka/327/base -> origin/gh/laithsakka/327/base 2025-12-04T11:11:09.6259251Z * [new branch] gh/laithsakka/327/head -> origin/gh/laithsakka/327/head 2025-12-04T11:11:09.6259330Z * [new branch] gh/laithsakka/327/orig -> origin/gh/laithsakka/327/orig 2025-12-04T11:11:09.6259404Z * [new branch] gh/laithsakka/328/base -> origin/gh/laithsakka/328/base 2025-12-04T11:11:09.6259478Z * [new branch] gh/laithsakka/328/head -> origin/gh/laithsakka/328/head 2025-12-04T11:11:09.6259558Z * [new branch] gh/laithsakka/328/orig -> origin/gh/laithsakka/328/orig 2025-12-04T11:11:09.6259630Z * [new branch] gh/liangel/4/base -> origin/gh/liangel/4/base 2025-12-04T11:11:09.6259704Z * [new branch] gh/liangel/4/head -> origin/gh/liangel/4/head 2025-12-04T11:11:09.6259777Z * [new branch] gh/liangel/4/orig -> origin/gh/liangel/4/orig 2025-12-04T11:11:09.6259856Z * [new branch] gh/lucaskabela/1/base -> origin/gh/lucaskabela/1/base 2025-12-04T11:11:09.6259935Z * [new branch] gh/lucaskabela/1/head -> origin/gh/lucaskabela/1/head 2025-12-04T11:11:09.6260007Z * [new branch] gh/lw/4/base -> origin/gh/lw/4/base 2025-12-04T11:11:09.6260073Z * [new branch] gh/lw/4/head -> origin/gh/lw/4/head 2025-12-04T11:11:09.6260138Z * [new branch] gh/lw/4/orig -> origin/gh/lw/4/orig 2025-12-04T11:11:09.6260206Z * [new branch] gh/lw/5/base -> origin/gh/lw/5/base 2025-12-04T11:11:09.6260270Z * [new branch] gh/lw/5/head -> origin/gh/lw/5/head 2025-12-04T11:11:09.6260332Z * [new branch] gh/lw/5/orig -> origin/gh/lw/5/orig 2025-12-04T11:11:09.6260399Z * [new branch] gh/lw/6/base -> origin/gh/lw/6/base 2025-12-04T11:11:09.6260464Z * [new branch] gh/lw/6/head -> origin/gh/lw/6/head 2025-12-04T11:11:09.6260528Z * [new branch] gh/lw/6/orig -> origin/gh/lw/6/orig 2025-12-04T11:11:09.6260601Z * [new branch] gh/malfet/14/base -> origin/gh/malfet/14/base 2025-12-04T11:11:09.6260674Z * [new branch] gh/malfet/417/base -> origin/gh/malfet/417/base 2025-12-04T11:11:09.6260749Z * [new branch] gh/malfet/417/head -> origin/gh/malfet/417/head 2025-12-04T11:11:09.6260823Z * [new branch] gh/malfet/417/orig -> origin/gh/malfet/417/orig 2025-12-04T11:11:09.6260893Z * [new branch] gh/malfet/506/base -> origin/gh/malfet/506/base 2025-12-04T11:11:09.6260967Z * [new branch] gh/malfet/506/head -> origin/gh/malfet/506/head 2025-12-04T11:11:09.6261037Z * [new branch] gh/malfet/506/orig -> origin/gh/malfet/506/orig 2025-12-04T11:11:09.6261107Z * [new branch] gh/malfet/517/base -> origin/gh/malfet/517/base 2025-12-04T11:11:09.6261180Z * [new branch] gh/malfet/517/head -> origin/gh/malfet/517/head 2025-12-04T11:11:09.6261251Z * [new branch] gh/malfet/528/base -> origin/gh/malfet/528/base 2025-12-04T11:11:09.6261320Z * [new branch] gh/malfet/528/head -> origin/gh/malfet/528/head 2025-12-04T11:11:09.6261392Z * [new branch] gh/malfet/528/orig -> origin/gh/malfet/528/orig 
2025-12-04T11:11:09.6261870Z * [new branch] gh/malfet/537/base -> origin/gh/malfet/537/base 2025-12-04T11:11:09.6261943Z * [new branch] gh/malfet/537/head -> origin/gh/malfet/537/head 2025-12-04T11:11:09.6262017Z * [new branch] gh/malfet/537/orig -> origin/gh/malfet/537/orig 2025-12-04T11:11:09.6262107Z * [new branch] gh/malfet/546/base -> origin/gh/malfet/546/base 2025-12-04T11:11:09.6262177Z * [new branch] gh/malfet/546/head -> origin/gh/malfet/546/head 2025-12-04T11:11:09.6262249Z * [new branch] gh/malfet/546/orig -> origin/gh/malfet/546/orig 2025-12-04T11:11:09.6262319Z * [new branch] gh/malfet/565/base -> origin/gh/malfet/565/base 2025-12-04T11:11:09.6262387Z * [new branch] gh/malfet/565/head -> origin/gh/malfet/565/head 2025-12-04T11:11:09.6262460Z * [new branch] gh/malfet/565/orig -> origin/gh/malfet/565/orig 2025-12-04T11:11:09.6262531Z * [new branch] gh/malfet/575/base -> origin/gh/malfet/575/base 2025-12-04T11:11:09.6262600Z * [new branch] gh/malfet/575/head -> origin/gh/malfet/575/head 2025-12-04T11:11:09.6262674Z * [new branch] gh/malfet/575/orig -> origin/gh/malfet/575/orig 2025-12-04T11:11:09.6262745Z * [new branch] gh/malfet/580/base -> origin/gh/malfet/580/base 2025-12-04T11:11:09.6262818Z * [new branch] gh/malfet/580/head -> origin/gh/malfet/580/head 2025-12-04T11:11:09.6262887Z * [new branch] gh/malfet/580/orig -> origin/gh/malfet/580/orig 2025-12-04T11:11:09.6262955Z * [new branch] gh/malfet/581/base -> origin/gh/malfet/581/base 2025-12-04T11:11:09.6263027Z * [new branch] gh/malfet/581/head -> origin/gh/malfet/581/head 2025-12-04T11:11:09.6263097Z * [new branch] gh/malfet/581/orig -> origin/gh/malfet/581/orig 2025-12-04T11:11:09.6263167Z * [new branch] gh/malfet/583/base -> origin/gh/malfet/583/base 2025-12-04T11:11:09.6263239Z * [new branch] gh/malfet/583/head -> origin/gh/malfet/583/head 2025-12-04T11:11:09.6263308Z * [new branch] gh/malfet/583/orig -> origin/gh/malfet/583/orig 2025-12-04T11:11:09.6263379Z * [new branch] gh/malfet/586/base -> origin/gh/malfet/586/base 2025-12-04T11:11:09.6263452Z * [new branch] gh/malfet/586/head -> origin/gh/malfet/586/head 2025-12-04T11:11:09.6263520Z * [new branch] gh/malfet/586/orig -> origin/gh/malfet/586/orig 2025-12-04T11:11:09.6263589Z * [new branch] gh/malfet/587/base -> origin/gh/malfet/587/base 2025-12-04T11:11:09.6263662Z * [new branch] gh/malfet/587/head -> origin/gh/malfet/587/head 2025-12-04T11:11:09.6263730Z * [new branch] gh/malfet/587/orig -> origin/gh/malfet/587/orig 2025-12-04T11:11:09.6263799Z * [new branch] gh/malfet/588/base -> origin/gh/malfet/588/base 2025-12-04T11:11:09.6263874Z * [new branch] gh/malfet/588/head -> origin/gh/malfet/588/head 2025-12-04T11:11:09.6263944Z * [new branch] gh/malfet/588/orig -> origin/gh/malfet/588/orig 2025-12-04T11:11:09.6264015Z * [new branch] gh/malfet/589/base -> origin/gh/malfet/589/base 2025-12-04T11:11:09.6264088Z * [new branch] gh/malfet/589/head -> origin/gh/malfet/589/head 2025-12-04T11:11:09.6264158Z * [new branch] gh/malfet/589/orig -> origin/gh/malfet/589/orig 2025-12-04T11:11:09.6264228Z * [new branch] gh/malfet/590/base -> origin/gh/malfet/590/base 2025-12-04T11:11:09.6264301Z * [new branch] gh/malfet/590/head -> origin/gh/malfet/590/head 2025-12-04T11:11:09.6264370Z * [new branch] gh/malfet/590/orig -> origin/gh/malfet/590/orig 2025-12-04T11:11:09.6264440Z * [new branch] gh/malfet/591/base -> origin/gh/malfet/591/base 2025-12-04T11:11:09.6264538Z * [new branch] gh/malfet/591/head -> origin/gh/malfet/591/head 2025-12-04T11:11:09.6264608Z * [new branch] 
gh/malfet/591/orig -> origin/gh/malfet/591/orig 2025-12-04T11:11:09.6264703Z * [new branch] gh/malfet/592/base -> origin/gh/malfet/592/base 2025-12-04T11:11:09.6264773Z * [new branch] gh/malfet/592/head -> origin/gh/malfet/592/head 2025-12-04T11:11:09.6264842Z * [new branch] gh/malfet/592/orig -> origin/gh/malfet/592/orig 2025-12-04T11:11:09.6264914Z * [new branch] gh/malfet/593/base -> origin/gh/malfet/593/base 2025-12-04T11:11:09.6264983Z * [new branch] gh/malfet/593/head -> origin/gh/malfet/593/head 2025-12-04T11:11:09.6265051Z * [new branch] gh/malfet/593/orig -> origin/gh/malfet/593/orig 2025-12-04T11:11:09.6265126Z * [new branch] gh/malfet/594/base -> origin/gh/malfet/594/base 2025-12-04T11:11:09.6265196Z * [new branch] gh/malfet/594/head -> origin/gh/malfet/594/head 2025-12-04T11:11:09.6265265Z * [new branch] gh/malfet/594/orig -> origin/gh/malfet/594/orig 2025-12-04T11:11:09.6265338Z * [new branch] gh/malfet/595/base -> origin/gh/malfet/595/base 2025-12-04T11:11:09.6265407Z * [new branch] gh/malfet/595/head -> origin/gh/malfet/595/head 2025-12-04T11:11:09.6265476Z * [new branch] gh/malfet/595/orig -> origin/gh/malfet/595/orig 2025-12-04T11:11:09.6265549Z * [new branch] gh/malfet/596/base -> origin/gh/malfet/596/base 2025-12-04T11:11:09.6265620Z * [new branch] gh/malfet/596/head -> origin/gh/malfet/596/head 2025-12-04T11:11:09.6265688Z * [new branch] gh/malfet/596/orig -> origin/gh/malfet/596/orig 2025-12-04T11:11:09.6265760Z * [new branch] gh/malfet/597/base -> origin/gh/malfet/597/base 2025-12-04T11:11:09.6265831Z * [new branch] gh/malfet/597/head -> origin/gh/malfet/597/head 2025-12-04T11:11:09.6265900Z * [new branch] gh/malfet/597/orig -> origin/gh/malfet/597/orig 2025-12-04T11:11:09.6265974Z * [new branch] gh/malfet/598/base -> origin/gh/malfet/598/base 2025-12-04T11:11:09.6266044Z * [new branch] gh/malfet/598/head -> origin/gh/malfet/598/head 2025-12-04T11:11:09.6266113Z * [new branch] gh/malfet/598/orig -> origin/gh/malfet/598/orig 2025-12-04T11:11:09.6266185Z * [new branch] gh/malfet/599/base -> origin/gh/malfet/599/base 2025-12-04T11:11:09.6266253Z * [new branch] gh/malfet/599/head -> origin/gh/malfet/599/head 2025-12-04T11:11:09.6266326Z * [new branch] gh/malfet/599/orig -> origin/gh/malfet/599/orig 2025-12-04T11:11:09.6266396Z * [new branch] gh/malfet/600/base -> origin/gh/malfet/600/base 2025-12-04T11:11:09.6266467Z * [new branch] gh/malfet/600/head -> origin/gh/malfet/600/head 2025-12-04T11:11:09.6266539Z * [new branch] gh/malfet/600/orig -> origin/gh/malfet/600/orig 2025-12-04T11:11:09.6266610Z * [new branch] gh/malfet/601/base -> origin/gh/malfet/601/base 2025-12-04T11:11:09.6266680Z * [new branch] gh/malfet/601/head -> origin/gh/malfet/601/head 2025-12-04T11:11:09.6266753Z * [new branch] gh/malfet/601/orig -> origin/gh/malfet/601/orig 2025-12-04T11:11:09.6266822Z * [new branch] gh/malfet/602/base -> origin/gh/malfet/602/base 2025-12-04T11:11:09.6266892Z * [new branch] gh/malfet/602/head -> origin/gh/malfet/602/head 2025-12-04T11:11:09.6266968Z * [new branch] gh/malfet/602/orig -> origin/gh/malfet/602/orig 2025-12-04T11:11:09.6267037Z * [new branch] gh/malfet/603/base -> origin/gh/malfet/603/base 2025-12-04T11:11:09.6267128Z * [new branch] gh/malfet/603/head -> origin/gh/malfet/603/head 2025-12-04T11:11:09.6278363Z * [new branch] gh/malfet/603/orig -> origin/gh/malfet/603/orig 2025-12-04T11:11:09.6278508Z * [new branch] gh/malfet/604/base -> origin/gh/malfet/604/base 2025-12-04T11:11:09.6278584Z * [new branch] gh/malfet/604/head -> origin/gh/malfet/604/head 
2025-12-04T11:11:09.6278663Z * [new branch] gh/malfet/604/orig -> origin/gh/malfet/604/orig 2025-12-04T11:11:09.6278734Z * [new branch] gh/malfet/605/base -> origin/gh/malfet/605/base 2025-12-04T11:11:09.6278805Z * [new branch] gh/malfet/605/head -> origin/gh/malfet/605/head 2025-12-04T11:11:09.6278884Z * [new branch] gh/malfet/605/orig -> origin/gh/malfet/605/orig 2025-12-04T11:11:09.6278958Z * [new branch] gh/malfet/606/base -> origin/gh/malfet/606/base 2025-12-04T11:11:09.6279032Z * [new branch] gh/malfet/606/head -> origin/gh/malfet/606/head 2025-12-04T11:11:09.6279109Z * [new branch] gh/malfet/606/orig -> origin/gh/malfet/606/orig 2025-12-04T11:11:09.6279185Z * [new branch] gh/malfet/607/base -> origin/gh/malfet/607/base 2025-12-04T11:11:09.6279255Z * [new branch] gh/malfet/607/head -> origin/gh/malfet/607/head 2025-12-04T11:11:09.6279335Z * [new branch] gh/malfet/607/orig -> origin/gh/malfet/607/orig 2025-12-04T11:11:09.6279406Z * [new branch] gh/malfet/608/base -> origin/gh/malfet/608/base 2025-12-04T11:11:09.6279485Z * [new branch] gh/malfet/608/head -> origin/gh/malfet/608/head 2025-12-04T11:11:09.6279557Z * [new branch] gh/malfet/608/orig -> origin/gh/malfet/608/orig 2025-12-04T11:11:09.6279627Z * [new branch] gh/malfet/609/base -> origin/gh/malfet/609/base 2025-12-04T11:11:09.6279702Z * [new branch] gh/malfet/609/head -> origin/gh/malfet/609/head 2025-12-04T11:11:09.6279775Z * [new branch] gh/malfet/609/orig -> origin/gh/malfet/609/orig 2025-12-04T11:11:09.6279855Z * [new branch] gh/malfet/610/base -> origin/gh/malfet/610/base 2025-12-04T11:11:09.6279930Z * [new branch] gh/malfet/610/head -> origin/gh/malfet/610/head 2025-12-04T11:11:09.6280003Z * [new branch] gh/malfet/610/orig -> origin/gh/malfet/610/orig 2025-12-04T11:11:09.6280074Z * [new branch] gh/malfet/611/base -> origin/gh/malfet/611/base 2025-12-04T11:11:09.6280149Z * [new branch] gh/malfet/611/head -> origin/gh/malfet/611/head 2025-12-04T11:11:09.6280220Z * [new branch] gh/malfet/611/orig -> origin/gh/malfet/611/orig 2025-12-04T11:11:09.6280290Z * [new branch] gh/malfet/612/base -> origin/gh/malfet/612/base 2025-12-04T11:11:09.6280367Z * [new branch] gh/malfet/612/head -> origin/gh/malfet/612/head 2025-12-04T11:11:09.6280438Z * [new branch] gh/malfet/612/orig -> origin/gh/malfet/612/orig 2025-12-04T11:11:09.6280513Z * [new branch] gh/malfet/64/base -> origin/gh/malfet/64/base 2025-12-04T11:11:09.6280596Z * [new branch] gh/malfet/64/head -> origin/gh/malfet/64/head 2025-12-04T11:11:09.6280692Z * [new branch] gh/manuelcandales/11/base -> origin/gh/manuelcandales/11/base 2025-12-04T11:11:09.6280781Z * [new branch] gh/manuelcandales/11/head -> origin/gh/manuelcandales/11/head 2025-12-04T11:11:09.6280872Z * [new branch] gh/manuelcandales/11/orig -> origin/gh/manuelcandales/11/orig 2025-12-04T11:11:09.6280949Z * [new branch] gh/markkm/1/base -> origin/gh/markkm/1/base 2025-12-04T11:11:09.6281027Z * [new branch] gh/masnesral/1/base -> origin/gh/masnesral/1/base 2025-12-04T11:11:09.6281139Z * [new branch] gh/masnesral/1/head -> origin/gh/masnesral/1/head 2025-12-04T11:11:09.6281217Z * [new branch] gh/masnesral/1/orig -> origin/gh/masnesral/1/orig 2025-12-04T11:11:09.6281321Z * [new branch] gh/mhorowitz/0/base -> origin/gh/mhorowitz/0/base 2025-12-04T11:11:09.6281395Z * [new branch] gh/mhorowitz/0/head -> origin/gh/mhorowitz/0/head 2025-12-04T11:11:09.6281469Z * [new branch] gh/mhorowitz/1/base -> origin/gh/mhorowitz/1/base 2025-12-04T11:11:09.6281546Z * [new branch] gh/mhorowitz/1/head -> origin/gh/mhorowitz/1/head 
2025-12-04T11:11:09.6281619Z * [new branch] gh/mhorowitz/2/base -> origin/gh/mhorowitz/2/base 2025-12-04T11:11:09.6281693Z * [new branch] gh/mhorowitz/2/head -> origin/gh/mhorowitz/2/head 2025-12-04T11:11:09.6281771Z * [new branch] gh/mhorowitz/3/base -> origin/gh/mhorowitz/3/base 2025-12-04T11:11:09.6281846Z * [new branch] gh/mhorowitz/3/head -> origin/gh/mhorowitz/3/head 2025-12-04T11:11:09.6281920Z * [new branch] gh/mhorowitz/4/base -> origin/gh/mhorowitz/4/base 2025-12-04T11:11:09.6282001Z * [new branch] gh/mhorowitz/4/head -> origin/gh/mhorowitz/4/head 2025-12-04T11:11:09.6282075Z * [new branch] gh/mhorowitz/5/base -> origin/gh/mhorowitz/5/base 2025-12-04T11:11:09.6282149Z * [new branch] gh/mhorowitz/5/head -> origin/gh/mhorowitz/5/head 2025-12-04T11:11:09.6282227Z * [new branch] gh/mhorowitz/6/base -> origin/gh/mhorowitz/6/base 2025-12-04T11:11:09.6282302Z * [new branch] gh/mhorowitz/6/head -> origin/gh/mhorowitz/6/head 2025-12-04T11:11:09.6282410Z * [new branch] gh/mikaylagawarecki/234/base -> origin/gh/mikaylagawarecki/234/base 2025-12-04T11:11:09.6282517Z * [new branch] gh/mikaylagawarecki/234/head -> origin/gh/mikaylagawarecki/234/head 2025-12-04T11:11:09.6282617Z * [new branch] gh/mikaylagawarecki/235/base -> origin/gh/mikaylagawarecki/235/base 2025-12-04T11:11:09.6282714Z * [new branch] gh/mikaylagawarecki/235/head -> origin/gh/mikaylagawarecki/235/head 2025-12-04T11:11:09.6282812Z * [new branch] gh/mikaylagawarecki/236/base -> origin/gh/mikaylagawarecki/236/base 2025-12-04T11:11:09.6282908Z * [new branch] gh/mikaylagawarecki/236/head -> origin/gh/mikaylagawarecki/236/head 2025-12-04T11:11:09.6283007Z * [new branch] gh/mikaylagawarecki/237/base -> origin/gh/mikaylagawarecki/237/base 2025-12-04T11:11:09.6283103Z * [new branch] gh/mikaylagawarecki/237/head -> origin/gh/mikaylagawarecki/237/head 2025-12-04T11:11:09.6283198Z * [new branch] gh/mikaylagawarecki/238/base -> origin/gh/mikaylagawarecki/238/base 2025-12-04T11:11:09.6283298Z * [new branch] gh/mikaylagawarecki/238/head -> origin/gh/mikaylagawarecki/238/head 2025-12-04T11:11:09.6283396Z * [new branch] gh/mikaylagawarecki/336/base -> origin/gh/mikaylagawarecki/336/base 2025-12-04T11:11:09.6283491Z * [new branch] gh/mikaylagawarecki/336/head -> origin/gh/mikaylagawarecki/336/head 2025-12-04T11:11:09.6283594Z * [new branch] gh/mikaylagawarecki/336/orig -> origin/gh/mikaylagawarecki/336/orig 2025-12-04T11:11:09.6283689Z * [new branch] gh/mikaylagawarecki/341/base -> origin/gh/mikaylagawarecki/341/base 2025-12-04T11:11:09.6283785Z * [new branch] gh/mikaylagawarecki/341/head -> origin/gh/mikaylagawarecki/341/head 2025-12-04T11:11:09.6283886Z * [new branch] gh/mikaylagawarecki/341/orig -> origin/gh/mikaylagawarecki/341/orig 2025-12-04T11:11:09.6283982Z * [new branch] gh/mikaylagawarecki/342/base -> origin/gh/mikaylagawarecki/342/base 2025-12-04T11:11:09.6284078Z * [new branch] gh/mikaylagawarecki/342/head -> origin/gh/mikaylagawarecki/342/head 2025-12-04T11:11:09.6284199Z * [new branch] gh/mikaylagawarecki/342/orig -> origin/gh/mikaylagawarecki/342/orig 2025-12-04T11:11:09.6284296Z * [new branch] gh/mikaylagawarecki/345/base -> origin/gh/mikaylagawarecki/345/base 2025-12-04T11:11:09.6284419Z * [new branch] gh/mikaylagawarecki/345/head -> origin/gh/mikaylagawarecki/345/head 2025-12-04T11:11:09.6284517Z * [new branch] gh/mikaylagawarecki/345/orig -> origin/gh/mikaylagawarecki/345/orig 2025-12-04T11:11:09.6284613Z * [new branch] gh/mikaylagawarecki/346/base -> origin/gh/mikaylagawarecki/346/base 2025-12-04T11:11:09.6284715Z * [new 
branch] gh/mikaylagawarecki/346/head -> origin/gh/mikaylagawarecki/346/head 2025-12-04T11:11:09.6284811Z * [new branch] gh/mikaylagawarecki/346/orig -> origin/gh/mikaylagawarecki/346/orig 2025-12-04T11:11:09.6284907Z * [new branch] gh/mikaylagawarecki/347/base -> origin/gh/mikaylagawarecki/347/base 2025-12-04T11:11:09.6285009Z * [new branch] gh/mikaylagawarecki/347/head -> origin/gh/mikaylagawarecki/347/head 2025-12-04T11:11:09.6285106Z * [new branch] gh/mikaylagawarecki/347/orig -> origin/gh/mikaylagawarecki/347/orig 2025-12-04T11:11:09.6285203Z * [new branch] gh/mikaylagawarecki/350/base -> origin/gh/mikaylagawarecki/350/base 2025-12-04T11:11:09.6285302Z * [new branch] gh/mikaylagawarecki/350/head -> origin/gh/mikaylagawarecki/350/head 2025-12-04T11:11:09.6285398Z * [new branch] gh/mikaylagawarecki/350/orig -> origin/gh/mikaylagawarecki/350/orig 2025-12-04T11:11:09.6285494Z * [new branch] gh/mikaylagawarecki/351/base -> origin/gh/mikaylagawarecki/351/base 2025-12-04T11:11:09.6285598Z * [new branch] gh/mikaylagawarecki/351/head -> origin/gh/mikaylagawarecki/351/head 2025-12-04T11:11:09.6285694Z * [new branch] gh/mikaylagawarecki/351/orig -> origin/gh/mikaylagawarecki/351/orig 2025-12-04T11:11:09.6285792Z * [new branch] gh/mikaylagawarecki/352/base -> origin/gh/mikaylagawarecki/352/base 2025-12-04T11:11:09.6285892Z * [new branch] gh/mikaylagawarecki/352/head -> origin/gh/mikaylagawarecki/352/head 2025-12-04T11:11:09.6285986Z * [new branch] gh/mikaylagawarecki/352/orig -> origin/gh/mikaylagawarecki/352/orig 2025-12-04T11:11:09.6286085Z * [new branch] gh/mikaylagawarecki/353/base -> origin/gh/mikaylagawarecki/353/base 2025-12-04T11:11:09.6286182Z * [new branch] gh/mikaylagawarecki/353/head -> origin/gh/mikaylagawarecki/353/head 2025-12-04T11:11:09.6286275Z * [new branch] gh/mikaylagawarecki/353/orig -> origin/gh/mikaylagawarecki/353/orig 2025-12-04T11:11:09.6286373Z * [new branch] gh/mikaylagawarecki/354/base -> origin/gh/mikaylagawarecki/354/base 2025-12-04T11:11:09.6286469Z * [new branch] gh/mikaylagawarecki/354/head -> origin/gh/mikaylagawarecki/354/head 2025-12-04T11:11:09.6286566Z * [new branch] gh/mikaylagawarecki/354/orig -> origin/gh/mikaylagawarecki/354/orig 2025-12-04T11:11:09.6286667Z * [new branch] gh/mikaylagawarecki/356/base -> origin/gh/mikaylagawarecki/356/base 2025-12-04T11:11:09.6286765Z * [new branch] gh/mikaylagawarecki/356/head -> origin/gh/mikaylagawarecki/356/head 2025-12-04T11:11:09.6286858Z * [new branch] gh/mikaylagawarecki/356/orig -> origin/gh/mikaylagawarecki/356/orig 2025-12-04T11:11:09.6286960Z * [new branch] gh/mikaylagawarecki/357/base -> origin/gh/mikaylagawarecki/357/base 2025-12-04T11:11:09.6287057Z * [new branch] gh/mikaylagawarecki/357/head -> origin/gh/mikaylagawarecki/357/head 2025-12-04T11:11:09.6287154Z * [new branch] gh/mikaylagawarecki/357/orig -> origin/gh/mikaylagawarecki/357/orig 2025-12-04T11:11:09.6287254Z * [new branch] gh/mikaylagawarecki/359/base -> origin/gh/mikaylagawarecki/359/base 2025-12-04T11:11:09.6287370Z * [new branch] gh/mikaylagawarecki/359/head -> origin/gh/mikaylagawarecki/359/head 2025-12-04T11:11:09.6287466Z * [new branch] gh/mikaylagawarecki/359/orig -> origin/gh/mikaylagawarecki/359/orig 2025-12-04T11:11:09.6287584Z * [new branch] gh/mikaylagawarecki/360/base -> origin/gh/mikaylagawarecki/360/base 2025-12-04T11:11:09.6287680Z * [new branch] gh/mikaylagawarecki/360/head -> origin/gh/mikaylagawarecki/360/head 2025-12-04T11:11:09.6287781Z * [new branch] gh/mikaylagawarecki/360/orig -> origin/gh/mikaylagawarecki/360/orig 
2025-12-04T11:11:09.6287876Z * [new branch] gh/mikaylagawarecki/361/base -> origin/gh/mikaylagawarecki/361/base 2025-12-04T11:11:09.6287971Z * [new branch] gh/mikaylagawarecki/361/head -> origin/gh/mikaylagawarecki/361/head 2025-12-04T11:11:09.6288071Z * [new branch] gh/mikaylagawarecki/361/orig -> origin/gh/mikaylagawarecki/361/orig 2025-12-04T11:11:09.6288200Z * [new branch] gh/mikaylagawarecki/362/base -> origin/gh/mikaylagawarecki/362/base 2025-12-04T11:11:09.6288298Z * [new branch] gh/mikaylagawarecki/362/head -> origin/gh/mikaylagawarecki/362/head 2025-12-04T11:11:09.6288399Z * [new branch] gh/mikaylagawarecki/362/orig -> origin/gh/mikaylagawarecki/362/orig 2025-12-04T11:11:09.6288494Z * [new branch] gh/mikaylagawarecki/363/base -> origin/gh/mikaylagawarecki/363/base 2025-12-04T11:11:09.6288592Z * [new branch] gh/mikaylagawarecki/363/head -> origin/gh/mikaylagawarecki/363/head 2025-12-04T11:11:09.6288690Z * [new branch] gh/mikaylagawarecki/363/orig -> origin/gh/mikaylagawarecki/363/orig 2025-12-04T11:11:09.6288787Z * [new branch] gh/mikaylagawarecki/364/base -> origin/gh/mikaylagawarecki/364/base 2025-12-04T11:11:09.6288883Z * [new branch] gh/mikaylagawarecki/364/head -> origin/gh/mikaylagawarecki/364/head 2025-12-04T11:11:09.6288982Z * [new branch] gh/mikaylagawarecki/364/orig -> origin/gh/mikaylagawarecki/364/orig 2025-12-04T11:11:09.6289078Z * [new branch] gh/mikaylagawarecki/365/base -> origin/gh/mikaylagawarecki/365/base 2025-12-04T11:11:09.6289181Z * [new branch] gh/mikaylagawarecki/365/head -> origin/gh/mikaylagawarecki/365/head 2025-12-04T11:11:09.6289278Z * [new branch] gh/mikaylagawarecki/365/orig -> origin/gh/mikaylagawarecki/365/orig 2025-12-04T11:11:09.6289373Z * [new branch] gh/mikaylagawarecki/366/base -> origin/gh/mikaylagawarecki/366/base 2025-12-04T11:11:09.6289476Z * [new branch] gh/mikaylagawarecki/366/head -> origin/gh/mikaylagawarecki/366/head 2025-12-04T11:11:09.6289573Z * [new branch] gh/mikaylagawarecki/366/orig -> origin/gh/mikaylagawarecki/366/orig 2025-12-04T11:11:09.6289669Z * [new branch] gh/mikaylagawarecki/367/base -> origin/gh/mikaylagawarecki/367/base 2025-12-04T11:11:09.6289774Z * [new branch] gh/mikaylagawarecki/367/head -> origin/gh/mikaylagawarecki/367/head 2025-12-04T11:11:09.6289870Z * [new branch] gh/mikaylagawarecki/367/orig -> origin/gh/mikaylagawarecki/367/orig 2025-12-04T11:11:09.6289966Z * [new branch] gh/mikaylagawarecki/368/base -> origin/gh/mikaylagawarecki/368/base 2025-12-04T11:11:09.6290068Z * [new branch] gh/mikaylagawarecki/368/head -> origin/gh/mikaylagawarecki/368/head 2025-12-04T11:11:09.6290164Z * [new branch] gh/mikaylagawarecki/368/orig -> origin/gh/mikaylagawarecki/368/orig 2025-12-04T11:11:09.6290259Z * [new branch] gh/mikaylagawarecki/369/base -> origin/gh/mikaylagawarecki/369/base 2025-12-04T11:11:09.6290362Z * [new branch] gh/mikaylagawarecki/369/head -> origin/gh/mikaylagawarecki/369/head 2025-12-04T11:11:09.6290460Z * [new branch] gh/mikaylagawarecki/369/orig -> origin/gh/mikaylagawarecki/369/orig 2025-12-04T11:11:09.6290598Z * [new branch] gh/mikaylagawarecki/370/base -> origin/gh/mikaylagawarecki/370/base 2025-12-04T11:11:09.6290696Z * [new branch] gh/mikaylagawarecki/370/head -> origin/gh/mikaylagawarecki/370/head 2025-12-04T11:11:09.6290819Z * [new branch] gh/mikaylagawarecki/370/orig -> origin/gh/mikaylagawarecki/370/orig 2025-12-04T11:11:09.6290920Z * [new branch] gh/mikaylagawarecki/371/base -> origin/gh/mikaylagawarecki/371/base 2025-12-04T11:11:09.6291016Z * [new branch] gh/mikaylagawarecki/371/head -> 
origin/gh/mikaylagawarecki/371/head 2025-12-04T11:11:09.6291111Z * [new branch] gh/mikaylagawarecki/371/orig -> origin/gh/mikaylagawarecki/371/orig 2025-12-04T11:11:09.6291213Z * [new branch] gh/mikaylagawarecki/372/base -> origin/gh/mikaylagawarecki/372/base 2025-12-04T11:11:09.6291309Z * [new branch] gh/mikaylagawarecki/372/head -> origin/gh/mikaylagawarecki/372/head 2025-12-04T11:11:09.6291406Z * [new branch] gh/mikaylagawarecki/372/orig -> origin/gh/mikaylagawarecki/372/orig 2025-12-04T11:11:09.6291507Z * [new branch] gh/mikaylagawarecki/373/base -> origin/gh/mikaylagawarecki/373/base 2025-12-04T11:11:09.6291604Z * [new branch] gh/mikaylagawarecki/373/head -> origin/gh/mikaylagawarecki/373/head 2025-12-04T11:11:09.6291700Z * [new branch] gh/mikaylagawarecki/373/orig -> origin/gh/mikaylagawarecki/373/orig 2025-12-04T11:11:09.6291800Z * [new branch] gh/mikaylagawarecki/374/base -> origin/gh/mikaylagawarecki/374/base 2025-12-04T11:11:09.6291895Z * [new branch] gh/mikaylagawarecki/374/head -> origin/gh/mikaylagawarecki/374/head 2025-12-04T11:11:09.6291990Z * [new branch] gh/mikaylagawarecki/374/orig -> origin/gh/mikaylagawarecki/374/orig 2025-12-04T11:11:09.6292088Z * [new branch] gh/mikaylagawarecki/375/base -> origin/gh/mikaylagawarecki/375/base 2025-12-04T11:11:09.6292188Z * [new branch] gh/mikaylagawarecki/375/head -> origin/gh/mikaylagawarecki/375/head 2025-12-04T11:11:09.6292285Z * [new branch] gh/mikaylagawarecki/375/orig -> origin/gh/mikaylagawarecki/375/orig 2025-12-04T11:11:09.6292380Z * [new branch] gh/mikaylagawarecki/376/base -> origin/gh/mikaylagawarecki/376/base 2025-12-04T11:11:09.6292474Z * [new branch] gh/mikaylagawarecki/376/head -> origin/gh/mikaylagawarecki/376/head 2025-12-04T11:11:09.6292574Z * [new branch] gh/mikaylagawarecki/376/orig -> origin/gh/mikaylagawarecki/376/orig 2025-12-04T11:11:09.6292670Z * [new branch] gh/mikaylagawarecki/377/base -> origin/gh/mikaylagawarecki/377/base 2025-12-04T11:11:09.6292766Z * [new branch] gh/mikaylagawarecki/377/head -> origin/gh/mikaylagawarecki/377/head 2025-12-04T11:11:09.6292870Z * [new branch] gh/mikaylagawarecki/377/orig -> origin/gh/mikaylagawarecki/377/orig 2025-12-04T11:11:09.6292968Z * [new branch] gh/mikaylagawarecki/378/base -> origin/gh/mikaylagawarecki/378/base 2025-12-04T11:11:09.6293064Z * [new branch] gh/mikaylagawarecki/378/head -> origin/gh/mikaylagawarecki/378/head 2025-12-04T11:11:09.6293167Z * [new branch] gh/mikaylagawarecki/378/orig -> origin/gh/mikaylagawarecki/378/orig 2025-12-04T11:11:09.6293263Z * [new branch] gh/mikaylagawarecki/379/base -> origin/gh/mikaylagawarecki/379/base 2025-12-04T11:11:09.6293360Z * [new branch] gh/mikaylagawarecki/379/head -> origin/gh/mikaylagawarecki/379/head 2025-12-04T11:11:09.6293462Z * [new branch] gh/mikaylagawarecki/379/orig -> origin/gh/mikaylagawarecki/379/orig 2025-12-04T11:11:09.6293560Z * [new branch] gh/mikaylagawarecki/380/base -> origin/gh/mikaylagawarecki/380/base 2025-12-04T11:11:09.6293662Z * [new branch] gh/mikaylagawarecki/380/head -> origin/gh/mikaylagawarecki/380/head 2025-12-04T11:11:09.6293778Z * [new branch] gh/mikaylagawarecki/380/orig -> origin/gh/mikaylagawarecki/380/orig 2025-12-04T11:11:09.6293873Z * [new branch] gh/mikaylagawarecki/381/base -> origin/gh/mikaylagawarecki/381/base 2025-12-04T11:11:09.6293995Z * [new branch] gh/mikaylagawarecki/381/head -> origin/gh/mikaylagawarecki/381/head 2025-12-04T11:11:09.6294093Z * [new branch] gh/mikaylagawarecki/381/orig -> origin/gh/mikaylagawarecki/381/orig 2025-12-04T11:11:09.6294190Z * [new branch] 
gh/mikaylagawarecki/382/base -> origin/gh/mikaylagawarecki/382/base 2025-12-04T11:11:09.6294291Z * [new branch] gh/mikaylagawarecki/382/head -> origin/gh/mikaylagawarecki/382/head 2025-12-04T11:11:09.6294388Z * [new branch] gh/mikaylagawarecki/382/orig -> origin/gh/mikaylagawarecki/382/orig 2025-12-04T11:11:09.6294484Z * [new branch] gh/mikaylagawarecki/383/base -> origin/gh/mikaylagawarecki/383/base 2025-12-04T11:11:09.6294586Z * [new branch] gh/mikaylagawarecki/383/head -> origin/gh/mikaylagawarecki/383/head 2025-12-04T11:11:09.6294680Z * [new branch] gh/mikaylagawarecki/383/orig -> origin/gh/mikaylagawarecki/383/orig 2025-12-04T11:11:09.6294777Z * [new branch] gh/mikaylagawarecki/384/base -> origin/gh/mikaylagawarecki/384/base 2025-12-04T11:11:09.6294877Z * [new branch] gh/mikaylagawarecki/384/head -> origin/gh/mikaylagawarecki/384/head 2025-12-04T11:11:09.6294972Z * [new branch] gh/mikaylagawarecki/384/orig -> origin/gh/mikaylagawarecki/384/orig 2025-12-04T11:11:09.6295068Z * [new branch] gh/mikaylagawarecki/385/base -> origin/gh/mikaylagawarecki/385/base 2025-12-04T11:11:09.6295172Z * [new branch] gh/mikaylagawarecki/385/head -> origin/gh/mikaylagawarecki/385/head 2025-12-04T11:11:09.6295267Z * [new branch] gh/mikaylagawarecki/385/orig -> origin/gh/mikaylagawarecki/385/orig 2025-12-04T11:11:09.6295369Z * [new branch] gh/mikaylagawarecki/386/base -> origin/gh/mikaylagawarecki/386/base 2025-12-04T11:11:09.6295464Z * [new branch] gh/mikaylagawarecki/386/head -> origin/gh/mikaylagawarecki/386/head 2025-12-04T11:11:09.6295558Z * [new branch] gh/mikaylagawarecki/386/orig -> origin/gh/mikaylagawarecki/386/orig 2025-12-04T11:11:09.6295653Z * [new branch] gh/mikaylagawarecki/387/base -> origin/gh/mikaylagawarecki/387/base 2025-12-04T11:11:09.6295753Z * [new branch] gh/mikaylagawarecki/387/head -> origin/gh/mikaylagawarecki/387/head 2025-12-04T11:11:09.6295849Z * [new branch] gh/mikaylagawarecki/387/orig -> origin/gh/mikaylagawarecki/387/orig 2025-12-04T11:11:09.6295943Z * [new branch] gh/mikaylagawarecki/388/base -> origin/gh/mikaylagawarecki/388/base 2025-12-04T11:11:09.6296041Z * [new branch] gh/mikaylagawarecki/388/head -> origin/gh/mikaylagawarecki/388/head 2025-12-04T11:11:09.6296137Z * [new branch] gh/mikaylagawarecki/388/orig -> origin/gh/mikaylagawarecki/388/orig 2025-12-04T11:11:09.6296233Z * [new branch] gh/mikaylagawarecki/389/base -> origin/gh/mikaylagawarecki/389/base 2025-12-04T11:11:09.6296331Z * [new branch] gh/mikaylagawarecki/389/head -> origin/gh/mikaylagawarecki/389/head 2025-12-04T11:11:09.6296427Z * [new branch] gh/mikaylagawarecki/389/orig -> origin/gh/mikaylagawarecki/389/orig 2025-12-04T11:11:09.6296522Z * [new branch] gh/mikaylagawarecki/390/base -> origin/gh/mikaylagawarecki/390/base 2025-12-04T11:11:09.6296621Z * [new branch] gh/mikaylagawarecki/390/head -> origin/gh/mikaylagawarecki/390/head 2025-12-04T11:11:09.6296715Z * [new branch] gh/mikaylagawarecki/390/orig -> origin/gh/mikaylagawarecki/390/orig 2025-12-04T11:11:09.6296811Z * [new branch] gh/mikaylagawarecki/391/base -> origin/gh/mikaylagawarecki/391/base 2025-12-04T11:11:09.6296928Z * [new branch] gh/mikaylagawarecki/391/head -> origin/gh/mikaylagawarecki/391/head 2025-12-04T11:11:09.6297023Z * [new branch] gh/mikaylagawarecki/391/orig -> origin/gh/mikaylagawarecki/391/orig 2025-12-04T11:11:09.6297151Z * [new branch] gh/mikaylagawarecki/392/base -> origin/gh/mikaylagawarecki/392/base 2025-12-04T11:11:09.6297247Z * [new branch] gh/mikaylagawarecki/392/head -> origin/gh/mikaylagawarecki/392/head 
2025-12-04T11:11:09.6297341Z * [new branch] gh/mikaylagawarecki/392/orig -> origin/gh/mikaylagawarecki/392/orig 2025-12-04T11:11:09.6297419Z * [new branch] gh/mlazos/41/base -> origin/gh/mlazos/41/base 2025-12-04T11:11:09.6297491Z * [new branch] gh/mlazos/41/head -> origin/gh/mlazos/41/head 2025-12-04T11:11:09.6297562Z * [new branch] gh/mlazos/41/orig -> origin/gh/mlazos/41/orig 2025-12-04T11:11:09.6297638Z * [new branch] gh/mlazos/42/base -> origin/gh/mlazos/42/base 2025-12-04T11:11:09.6297709Z * [new branch] gh/mlazos/42/head -> origin/gh/mlazos/42/head 2025-12-04T11:11:09.6297779Z * [new branch] gh/mlazos/42/orig -> origin/gh/mlazos/42/orig 2025-12-04T11:11:09.6297853Z * [new branch] gh/mlazos/43/base -> origin/gh/mlazos/43/base 2025-12-04T11:11:09.6297922Z * [new branch] gh/mlazos/43/head -> origin/gh/mlazos/43/head 2025-12-04T11:11:09.6297991Z * [new branch] gh/mlazos/43/orig -> origin/gh/mlazos/43/orig 2025-12-04T11:11:09.6298065Z * [new branch] gh/mlazos/44/base -> origin/gh/mlazos/44/base 2025-12-04T11:11:09.6298133Z * [new branch] gh/mlazos/44/head -> origin/gh/mlazos/44/head 2025-12-04T11:11:09.6298232Z * [new branch] gh/mlazos/44/orig -> origin/gh/mlazos/44/orig 2025-12-04T11:11:09.6298309Z * [new branch] gh/mlazos/47/base -> origin/gh/mlazos/47/base 2025-12-04T11:11:09.6298380Z * [new branch] gh/mlazos/47/head -> origin/gh/mlazos/47/head 2025-12-04T11:11:09.6298451Z * [new branch] gh/mlazos/47/orig -> origin/gh/mlazos/47/orig 2025-12-04T11:11:09.6298528Z * [new branch] gh/mlazos/48/base -> origin/gh/mlazos/48/base 2025-12-04T11:11:09.6298599Z * [new branch] gh/mlazos/48/head -> origin/gh/mlazos/48/head 2025-12-04T11:11:09.6298672Z * [new branch] gh/mlazos/48/orig -> origin/gh/mlazos/48/orig 2025-12-04T11:11:09.6298742Z * [new branch] gh/mlazos/49/base -> origin/gh/mlazos/49/base 2025-12-04T11:11:09.6298812Z * [new branch] gh/mlazos/49/head -> origin/gh/mlazos/49/head 2025-12-04T11:11:09.6298884Z * [new branch] gh/mlazos/49/orig -> origin/gh/mlazos/49/orig 2025-12-04T11:11:09.6298956Z * [new branch] gh/mlazos/50/base -> origin/gh/mlazos/50/base 2025-12-04T11:11:09.6299026Z * [new branch] gh/mlazos/50/head -> origin/gh/mlazos/50/head 2025-12-04T11:11:09.6299100Z * [new branch] gh/mlazos/50/orig -> origin/gh/mlazos/50/orig 2025-12-04T11:11:09.6299172Z * [new branch] gh/mlazos/51/base -> origin/gh/mlazos/51/base 2025-12-04T11:11:09.6299243Z * [new branch] gh/mlazos/51/head -> origin/gh/mlazos/51/head 2025-12-04T11:11:09.6299317Z * [new branch] gh/mlazos/51/orig -> origin/gh/mlazos/51/orig 2025-12-04T11:11:09.6299386Z * [new branch] gh/mlazos/52/base -> origin/gh/mlazos/52/base 2025-12-04T11:11:09.6299455Z * [new branch] gh/mlazos/52/head -> origin/gh/mlazos/52/head 2025-12-04T11:11:09.6299526Z * [new branch] gh/mlazos/52/orig -> origin/gh/mlazos/52/orig 2025-12-04T11:11:09.6299628Z * [new branch] gh/mlazos/53/base -> origin/gh/mlazos/53/base 2025-12-04T11:11:09.6299699Z * [new branch] gh/mlazos/53/head -> origin/gh/mlazos/53/head 2025-12-04T11:11:09.6299773Z * [new branch] gh/mlazos/53/orig -> origin/gh/mlazos/53/orig 2025-12-04T11:11:09.6299872Z * [new branch] gh/mlazos/54/base -> origin/gh/mlazos/54/base 2025-12-04T11:11:09.6299943Z * [new branch] gh/mlazos/54/head -> origin/gh/mlazos/54/head 2025-12-04T11:11:09.6300018Z * [new branch] gh/mlazos/54/orig -> origin/gh/mlazos/54/orig 2025-12-04T11:11:09.6300087Z * [new branch] gh/mlazos/55/base -> origin/gh/mlazos/55/base 2025-12-04T11:11:09.6300158Z * [new branch] gh/mlazos/55/head -> origin/gh/mlazos/55/head 
2025-12-04T11:11:09.6300231Z * [new branch] gh/mlazos/55/orig -> origin/gh/mlazos/55/orig 2025-12-04T11:11:09.6300299Z * [new branch] gh/mlazos/56/base -> origin/gh/mlazos/56/base 2025-12-04T11:11:09.6300376Z * [new branch] gh/mlazos/56/head -> origin/gh/mlazos/56/head 2025-12-04T11:11:09.6300445Z * [new branch] gh/mlazos/56/orig -> origin/gh/mlazos/56/orig 2025-12-04T11:11:09.6300555Z * [new branch] gh/mlazos/57/base -> origin/gh/mlazos/57/base 2025-12-04T11:11:09.6300666Z * [new branch] gh/mlazos/57/head -> origin/gh/mlazos/57/head 2025-12-04T11:11:09.6300739Z * [new branch] gh/mlazos/57/orig -> origin/gh/mlazos/57/orig 2025-12-04T11:11:09.6300809Z * [new branch] gh/mlazos/58/base -> origin/gh/mlazos/58/base 2025-12-04T11:11:09.6300884Z * [new branch] gh/mlazos/58/head -> origin/gh/mlazos/58/head 2025-12-04T11:11:09.6300954Z * [new branch] gh/mlazos/58/orig -> origin/gh/mlazos/58/orig 2025-12-04T11:11:09.6301024Z * [new branch] gh/mlazos/59/base -> origin/gh/mlazos/59/base 2025-12-04T11:11:09.6301102Z * [new branch] gh/mlazos/59/head -> origin/gh/mlazos/59/head 2025-12-04T11:11:09.6301171Z * [new branch] gh/mlazos/59/orig -> origin/gh/mlazos/59/orig 2025-12-04T11:11:09.6301244Z * [new branch] gh/mlazos/60/base -> origin/gh/mlazos/60/base 2025-12-04T11:11:09.6301316Z * [new branch] gh/mlazos/60/head -> origin/gh/mlazos/60/head 2025-12-04T11:11:09.6301386Z * [new branch] gh/mlazos/60/orig -> origin/gh/mlazos/60/orig 2025-12-04T11:11:09.6301456Z * [new branch] gh/mlazos/61/base -> origin/gh/mlazos/61/base 2025-12-04T11:11:09.6301532Z * [new branch] gh/mlazos/61/head -> origin/gh/mlazos/61/head 2025-12-04T11:11:09.6301602Z * [new branch] gh/mlazos/61/orig -> origin/gh/mlazos/61/orig 2025-12-04T11:11:09.6301672Z * [new branch] gh/mlazos/62/base -> origin/gh/mlazos/62/base 2025-12-04T11:11:09.6301750Z * [new branch] gh/mlazos/62/head -> origin/gh/mlazos/62/head 2025-12-04T11:11:09.6301820Z * [new branch] gh/mlazos/62/orig -> origin/gh/mlazos/62/orig 2025-12-04T11:11:09.6301891Z * [new branch] gh/mlazos/63/base -> origin/gh/mlazos/63/base 2025-12-04T11:11:09.6301965Z * [new branch] gh/mlazos/63/head -> origin/gh/mlazos/63/head 2025-12-04T11:11:09.6302034Z * [new branch] gh/mlazos/63/orig -> origin/gh/mlazos/63/orig 2025-12-04T11:11:09.6302107Z * [new branch] gh/mlazos/64/base -> origin/gh/mlazos/64/base 2025-12-04T11:11:09.6302176Z * [new branch] gh/mlazos/64/head -> origin/gh/mlazos/64/head 2025-12-04T11:11:09.6302246Z * [new branch] gh/mlazos/64/orig -> origin/gh/mlazos/64/orig 2025-12-04T11:11:09.6302321Z * [new branch] gh/mlazos/65/base -> origin/gh/mlazos/65/base 2025-12-04T11:11:09.6302416Z * [new branch] gh/mlazos/65/head -> origin/gh/mlazos/65/head 2025-12-04T11:11:09.6302487Z * [new branch] gh/mlazos/65/orig -> origin/gh/mlazos/65/orig 2025-12-04T11:11:09.6302561Z * [new branch] gh/mlazos/66/base -> origin/gh/mlazos/66/base 2025-12-04T11:11:09.6302653Z * [new branch] gh/mlazos/66/head -> origin/gh/mlazos/66/head 2025-12-04T11:11:09.6302722Z * [new branch] gh/mlazos/66/orig -> origin/gh/mlazos/66/orig 2025-12-04T11:11:09.6302795Z * [new branch] gh/mlazos/67/base -> origin/gh/mlazos/67/base 2025-12-04T11:11:09.6302865Z * [new branch] gh/mlazos/67/head -> origin/gh/mlazos/67/head 2025-12-04T11:11:09.6302934Z * [new branch] gh/mlazos/67/orig -> origin/gh/mlazos/67/orig 2025-12-04T11:11:09.6303008Z * [new branch] gh/mlazos/68/base -> origin/gh/mlazos/68/base 2025-12-04T11:11:09.6303077Z * [new branch] gh/mlazos/68/head -> origin/gh/mlazos/68/head 2025-12-04T11:11:09.6303146Z * [new branch] 
gh/mlazos/68/orig -> origin/gh/mlazos/68/orig 2025-12-04T11:11:09.6303221Z * [new branch] gh/mlazos/69/base -> origin/gh/mlazos/69/base 2025-12-04T11:11:09.6303291Z * [new branch] gh/mlazos/69/head -> origin/gh/mlazos/69/head 2025-12-04T11:11:09.6303361Z * [new branch] gh/mlazos/69/orig -> origin/gh/mlazos/69/orig 2025-12-04T11:11:09.6303435Z * [new branch] gh/mlazos/70/base -> origin/gh/mlazos/70/base 2025-12-04T11:11:09.6303505Z * [new branch] gh/mlazos/70/head -> origin/gh/mlazos/70/head 2025-12-04T11:11:09.6303573Z * [new branch] gh/mlazos/70/orig -> origin/gh/mlazos/70/orig 2025-12-04T11:11:09.6303646Z * [new branch] gh/mlazos/71/base -> origin/gh/mlazos/71/base 2025-12-04T11:11:09.6303716Z * [new branch] gh/mlazos/71/head -> origin/gh/mlazos/71/head 2025-12-04T11:11:09.6303786Z * [new branch] gh/mlazos/71/orig -> origin/gh/mlazos/71/orig 2025-12-04T11:11:09.6303860Z * [new branch] gh/mlazos/72/base -> origin/gh/mlazos/72/base 2025-12-04T11:11:09.6303930Z * [new branch] gh/mlazos/72/head -> origin/gh/mlazos/72/head 2025-12-04T11:11:09.6304002Z * [new branch] gh/mlazos/72/orig -> origin/gh/mlazos/72/orig 2025-12-04T11:11:09.6304071Z * [new branch] gh/mlazos/73/base -> origin/gh/mlazos/73/base 2025-12-04T11:11:09.6304141Z * [new branch] gh/mlazos/73/head -> origin/gh/mlazos/73/head 2025-12-04T11:11:09.6304212Z * [new branch] gh/mlazos/73/orig -> origin/gh/mlazos/73/orig 2025-12-04T11:11:09.6304284Z * [new branch] gh/mrmiywj/1/base -> origin/gh/mrmiywj/1/base 2025-12-04T11:11:09.6304356Z * [new branch] gh/mrmiywj/1/head -> origin/gh/mrmiywj/1/head 2025-12-04T11:11:09.6304437Z * [new branch] gh/muchulee8/73/base -> origin/gh/muchulee8/73/base 2025-12-04T11:11:09.6304514Z * [new branch] gh/muchulee8/73/head -> origin/gh/muchulee8/73/head 2025-12-04T11:11:09.6304591Z * [new branch] gh/muchulee8/73/orig -> origin/gh/muchulee8/73/orig 2025-12-04T11:11:09.6304684Z * [new branch] gh/naveenthangudu/1/base -> origin/gh/naveenthangudu/1/base 2025-12-04T11:11:09.6304773Z * [new branch] gh/naveenthangudu/1/head -> origin/gh/naveenthangudu/1/head 2025-12-04T11:11:09.6304857Z * [new branch] gh/naveenthangudu/1/orig -> origin/gh/naveenthangudu/1/orig 2025-12-04T11:11:09.6304945Z * [new branch] gh/naveenthangudu/2/base -> origin/gh/naveenthangudu/2/base 2025-12-04T11:11:09.6305027Z * [new branch] gh/naveenthangudu/2/head -> origin/gh/naveenthangudu/2/head 2025-12-04T11:11:09.6305130Z * [new branch] gh/naveenthangudu/2/orig -> origin/gh/naveenthangudu/2/orig 2025-12-04T11:11:09.6305218Z * [new branch] gh/naveenthangudu/3/base -> origin/gh/naveenthangudu/3/base 2025-12-04T11:11:09.6305301Z * [new branch] gh/naveenthangudu/3/head -> origin/gh/naveenthangudu/3/head 2025-12-04T11:11:09.6305413Z * [new branch] gh/naveenthangudu/3/orig -> origin/gh/naveenthangudu/3/orig 2025-12-04T11:11:09.6305500Z * [new branch] gh/naveenthangudu/4/base -> origin/gh/naveenthangudu/4/base 2025-12-04T11:11:09.6305585Z * [new branch] gh/naveenthangudu/4/head -> origin/gh/naveenthangudu/4/head 2025-12-04T11:11:09.6305675Z * [new branch] gh/naveenthangudu/4/orig -> origin/gh/naveenthangudu/4/orig 2025-12-04T11:11:09.6305759Z * [new branch] gh/naveenthangudu/5/base -> origin/gh/naveenthangudu/5/base 2025-12-04T11:11:09.6305842Z * [new branch] gh/naveenthangudu/5/head -> origin/gh/naveenthangudu/5/head 2025-12-04T11:11:09.6305929Z * [new branch] gh/naveenthangudu/5/orig -> origin/gh/naveenthangudu/5/orig 2025-12-04T11:11:09.6306011Z * [new branch] gh/naveenthangudu/6/base -> origin/gh/naveenthangudu/6/base 
2025-12-04T11:11:09.6306096Z * [new branch] gh/naveenthangudu/6/head -> origin/gh/naveenthangudu/6/head 2025-12-04T11:11:09.6306180Z * [new branch] gh/naveenthangudu/6/orig -> origin/gh/naveenthangudu/6/orig 2025-12-04T11:11:09.6306262Z * [new branch] gh/naveenthangudu/7/base -> origin/gh/naveenthangudu/7/base 2025-12-04T11:11:09.6306344Z * [new branch] gh/naveenthangudu/7/head -> origin/gh/naveenthangudu/7/head 2025-12-04T11:11:09.6306429Z * [new branch] gh/naveenthangudu/7/orig -> origin/gh/naveenthangudu/7/orig 2025-12-04T11:11:09.6306511Z * [new branch] gh/naveenthangudu/8/base -> origin/gh/naveenthangudu/8/base 2025-12-04T11:11:09.6306595Z * [new branch] gh/naveenthangudu/8/head -> origin/gh/naveenthangudu/8/head 2025-12-04T11:11:09.6306681Z * [new branch] gh/naveenthangudu/8/orig -> origin/gh/naveenthangudu/8/orig 2025-12-04T11:11:09.6306763Z * [new branch] gh/naveenthangudu/9/base -> origin/gh/naveenthangudu/9/base 2025-12-04T11:11:09.6306847Z * [new branch] gh/naveenthangudu/9/head -> origin/gh/naveenthangudu/9/head 2025-12-04T11:11:09.6306933Z * [new branch] gh/naveenthangudu/9/orig -> origin/gh/naveenthangudu/9/orig 2025-12-04T11:11:09.6307009Z * [new branch] gh/nikitaved/1/base -> origin/gh/nikitaved/1/base 2025-12-04T11:11:09.6307086Z * [new branch] gh/nikitaved/1/head -> origin/gh/nikitaved/1/head 2025-12-04T11:11:09.6307166Z * [new branch] gh/nikitaved/1/orig -> origin/gh/nikitaved/1/orig 2025-12-04T11:11:09.6307245Z * [new branch] gh/nikitaved/10/base -> origin/gh/nikitaved/10/base 2025-12-04T11:11:09.6307328Z * [new branch] gh/nikitaved/10/head -> origin/gh/nikitaved/10/head 2025-12-04T11:11:09.6307405Z * [new branch] gh/nikitaved/10/orig -> origin/gh/nikitaved/10/orig 2025-12-04T11:11:09.6307482Z * [new branch] gh/nikitaved/11/base -> origin/gh/nikitaved/11/base 2025-12-04T11:11:09.6307561Z * [new branch] gh/nikitaved/11/head -> origin/gh/nikitaved/11/head 2025-12-04T11:11:09.6307635Z * [new branch] gh/nikitaved/11/orig -> origin/gh/nikitaved/11/orig 2025-12-04T11:11:09.6307709Z * [new branch] gh/nikitaved/12/base -> origin/gh/nikitaved/12/base 2025-12-04T11:11:09.6307787Z * [new branch] gh/nikitaved/12/head -> origin/gh/nikitaved/12/head 2025-12-04T11:11:09.6307861Z * [new branch] gh/nikitaved/12/orig -> origin/gh/nikitaved/12/orig 2025-12-04T11:11:09.6307935Z * [new branch] gh/nikitaved/13/base -> origin/gh/nikitaved/13/base 2025-12-04T11:11:09.6308042Z * [new branch] gh/nikitaved/13/head -> origin/gh/nikitaved/13/head 2025-12-04T11:11:09.6308117Z * [new branch] gh/nikitaved/13/orig -> origin/gh/nikitaved/13/orig 2025-12-04T11:11:09.6308262Z * [new branch] gh/nikitaved/14/base -> origin/gh/nikitaved/14/base 2025-12-04T11:11:09.6308341Z * [new branch] gh/nikitaved/14/head -> origin/gh/nikitaved/14/head 2025-12-04T11:11:09.6308415Z * [new branch] gh/nikitaved/14/orig -> origin/gh/nikitaved/14/orig 2025-12-04T11:11:09.6308489Z * [new branch] gh/nikitaved/15/base -> origin/gh/nikitaved/15/base 2025-12-04T11:11:09.6308567Z * [new branch] gh/nikitaved/15/head -> origin/gh/nikitaved/15/head 2025-12-04T11:11:09.6308641Z * [new branch] gh/nikitaved/15/orig -> origin/gh/nikitaved/15/orig 2025-12-04T11:11:09.6308716Z * [new branch] gh/nikitaved/16/base -> origin/gh/nikitaved/16/base 2025-12-04T11:11:09.6308798Z * [new branch] gh/nikitaved/16/head -> origin/gh/nikitaved/16/head 2025-12-04T11:11:09.6308870Z * [new branch] gh/nikitaved/16/orig -> origin/gh/nikitaved/16/orig 2025-12-04T11:11:09.6308949Z * [new branch] gh/nikitaved/2/base -> origin/gh/nikitaved/2/base 
2025-12-04T11:11:09.6309024Z * [new branch] gh/nikitaved/2/head -> origin/gh/nikitaved/2/head 2025-12-04T11:11:09.6309099Z * [new branch] gh/nikitaved/2/orig -> origin/gh/nikitaved/2/orig 2025-12-04T11:11:09.6309176Z * [new branch] gh/nikitaved/4/base -> origin/gh/nikitaved/4/base 2025-12-04T11:11:09.6309249Z * [new branch] gh/nikitaved/4/head -> origin/gh/nikitaved/4/head 2025-12-04T11:11:09.6309322Z * [new branch] gh/nikitaved/4/orig -> origin/gh/nikitaved/4/orig 2025-12-04T11:11:09.6309397Z * [new branch] gh/nikitaved/5/base -> origin/gh/nikitaved/5/base 2025-12-04T11:11:09.6309471Z * [new branch] gh/nikitaved/5/head -> origin/gh/nikitaved/5/head 2025-12-04T11:11:09.6309550Z * [new branch] gh/nikitaved/5/orig -> origin/gh/nikitaved/5/orig 2025-12-04T11:11:09.6309627Z * [new branch] gh/nikitaved/6/base -> origin/gh/nikitaved/6/base 2025-12-04T11:11:09.6309701Z * [new branch] gh/nikitaved/6/head -> origin/gh/nikitaved/6/head 2025-12-04T11:11:09.6309780Z * [new branch] gh/nikitaved/6/orig -> origin/gh/nikitaved/6/orig 2025-12-04T11:11:09.6309853Z * [new branch] gh/nikitaved/8/base -> origin/gh/nikitaved/8/base 2025-12-04T11:11:09.6309927Z * [new branch] gh/nikitaved/8/head -> origin/gh/nikitaved/8/head 2025-12-04T11:11:09.6310004Z * [new branch] gh/nikitaved/8/orig -> origin/gh/nikitaved/8/orig 2025-12-04T11:11:09.6310077Z * [new branch] gh/nikitaved/9/base -> origin/gh/nikitaved/9/base 2025-12-04T11:11:09.6310153Z * [new branch] gh/nikitaved/9/head -> origin/gh/nikitaved/9/head 2025-12-04T11:11:09.6310232Z * [new branch] gh/nikitaved/9/orig -> origin/gh/nikitaved/9/orig 2025-12-04T11:11:09.6310307Z * [new branch] gh/oulgen/10/base -> origin/gh/oulgen/10/base 2025-12-04T11:11:09.6310378Z * [new branch] gh/oulgen/10/head -> origin/gh/oulgen/10/head 2025-12-04T11:11:09.6310450Z * [new branch] gh/oulgen/10/orig -> origin/gh/oulgen/10/orig 2025-12-04T11:11:09.6310519Z * [new branch] gh/oulgen/11/base -> origin/gh/oulgen/11/base 2025-12-04T11:11:09.6310588Z * [new branch] gh/oulgen/11/head -> origin/gh/oulgen/11/head 2025-12-04T11:11:09.6310659Z * [new branch] gh/oulgen/11/orig -> origin/gh/oulgen/11/orig 2025-12-04T11:11:09.6310727Z * [new branch] gh/oulgen/12/base -> origin/gh/oulgen/12/base 2025-12-04T11:11:09.6310832Z * [new branch] gh/oulgen/12/head -> origin/gh/oulgen/12/head 2025-12-04T11:11:09.6310902Z * [new branch] gh/oulgen/12/orig -> origin/gh/oulgen/12/orig 2025-12-04T11:11:09.6310999Z * [new branch] gh/oulgen/13/base -> origin/gh/oulgen/13/base 2025-12-04T11:11:09.6311071Z * [new branch] gh/oulgen/13/head -> origin/gh/oulgen/13/head 2025-12-04T11:11:09.6311139Z * [new branch] gh/oulgen/13/orig -> origin/gh/oulgen/13/orig 2025-12-04T11:11:09.6311208Z * [new branch] gh/oulgen/14/base -> origin/gh/oulgen/14/base 2025-12-04T11:11:09.6311281Z * [new branch] gh/oulgen/14/head -> origin/gh/oulgen/14/head 2025-12-04T11:11:09.6311350Z * [new branch] gh/oulgen/14/orig -> origin/gh/oulgen/14/orig 2025-12-04T11:11:09.6311419Z * [new branch] gh/oulgen/15/base -> origin/gh/oulgen/15/base 2025-12-04T11:11:09.6311492Z * [new branch] gh/oulgen/15/head -> origin/gh/oulgen/15/head 2025-12-04T11:11:09.6311562Z * [new branch] gh/oulgen/15/orig -> origin/gh/oulgen/15/orig 2025-12-04T11:11:09.6311631Z * [new branch] gh/oulgen/16/base -> origin/gh/oulgen/16/base 2025-12-04T11:11:09.6311704Z * [new branch] gh/oulgen/16/head -> origin/gh/oulgen/16/head 2025-12-04T11:11:09.6311772Z * [new branch] gh/oulgen/16/orig -> origin/gh/oulgen/16/orig 2025-12-04T11:11:09.6311843Z * [new branch] gh/oulgen/17/base -> 
origin/gh/oulgen/17/base 2025-12-04T11:11:09.6311916Z * [new branch] gh/oulgen/17/head -> origin/gh/oulgen/17/head 2025-12-04T11:11:09.6311985Z * [new branch] gh/oulgen/17/orig -> origin/gh/oulgen/17/orig 2025-12-04T11:11:09.6312055Z * [new branch] gh/oulgen/18/base -> origin/gh/oulgen/18/base 2025-12-04T11:11:09.6312130Z * [new branch] gh/oulgen/18/head -> origin/gh/oulgen/18/head 2025-12-04T11:11:09.6312199Z * [new branch] gh/oulgen/18/orig -> origin/gh/oulgen/18/orig 2025-12-04T11:11:09.6312270Z * [new branch] gh/oulgen/19/base -> origin/gh/oulgen/19/base 2025-12-04T11:11:09.6312343Z * [new branch] gh/oulgen/19/head -> origin/gh/oulgen/19/head 2025-12-04T11:11:09.6312412Z * [new branch] gh/oulgen/19/orig -> origin/gh/oulgen/19/orig 2025-12-04T11:11:09.6312480Z * [new branch] gh/oulgen/20/base -> origin/gh/oulgen/20/base 2025-12-04T11:11:09.6312552Z * [new branch] gh/oulgen/20/head -> origin/gh/oulgen/20/head 2025-12-04T11:11:09.6312621Z * [new branch] gh/oulgen/20/orig -> origin/gh/oulgen/20/orig 2025-12-04T11:11:09.6312693Z * [new branch] gh/oulgen/21/base -> origin/gh/oulgen/21/base 2025-12-04T11:11:09.6312762Z * [new branch] gh/oulgen/21/head -> origin/gh/oulgen/21/head 2025-12-04T11:11:09.6312830Z * [new branch] gh/oulgen/21/orig -> origin/gh/oulgen/21/orig 2025-12-04T11:11:09.6312899Z * [new branch] gh/oulgen/22/base -> origin/gh/oulgen/22/base 2025-12-04T11:11:09.6312969Z * [new branch] gh/oulgen/22/head -> origin/gh/oulgen/22/head 2025-12-04T11:11:09.6313038Z * [new branch] gh/oulgen/22/orig -> origin/gh/oulgen/22/orig 2025-12-04T11:11:09.6313111Z * [new branch] gh/oulgen/23/base -> origin/gh/oulgen/23/base 2025-12-04T11:11:09.6313179Z * [new branch] gh/oulgen/23/head -> origin/gh/oulgen/23/head 2025-12-04T11:11:09.6313247Z * [new branch] gh/oulgen/23/orig -> origin/gh/oulgen/23/orig 2025-12-04T11:11:09.6313319Z * [new branch] gh/oulgen/24/base -> origin/gh/oulgen/24/base 2025-12-04T11:11:09.6313410Z * [new branch] gh/oulgen/24/head -> origin/gh/oulgen/24/head 2025-12-04T11:11:09.6313479Z * [new branch] gh/oulgen/24/orig -> origin/gh/oulgen/24/orig 2025-12-04T11:11:09.6313550Z * [new branch] gh/oulgen/25/base -> origin/gh/oulgen/25/base 2025-12-04T11:11:09.6313641Z * [new branch] gh/oulgen/25/head -> origin/gh/oulgen/25/head 2025-12-04T11:11:09.6313709Z * [new branch] gh/oulgen/25/orig -> origin/gh/oulgen/25/orig 2025-12-04T11:11:09.6313781Z * [new branch] gh/oulgen/26/base -> origin/gh/oulgen/26/base 2025-12-04T11:11:09.6313848Z * [new branch] gh/oulgen/26/head -> origin/gh/oulgen/26/head 2025-12-04T11:11:09.6313916Z * [new branch] gh/oulgen/26/orig -> origin/gh/oulgen/26/orig 2025-12-04T11:11:09.6313989Z * [new branch] gh/oulgen/4/base -> origin/gh/oulgen/4/base 2025-12-04T11:11:09.6314058Z * [new branch] gh/oulgen/4/head -> origin/gh/oulgen/4/head 2025-12-04T11:11:09.6314128Z * [new branch] gh/oulgen/4/orig -> origin/gh/oulgen/4/orig 2025-12-04T11:11:09.6314202Z * [new branch] gh/oulgen/7/base -> origin/gh/oulgen/7/base 2025-12-04T11:11:09.6314272Z * [new branch] gh/oulgen/7/head -> origin/gh/oulgen/7/head 2025-12-04T11:11:09.6314344Z * [new branch] gh/oulgen/7/orig -> origin/gh/oulgen/7/orig 2025-12-04T11:11:09.6314411Z * [new branch] gh/oulgen/8/base -> origin/gh/oulgen/8/base 2025-12-04T11:11:09.6314478Z * [new branch] gh/oulgen/8/head -> origin/gh/oulgen/8/head 2025-12-04T11:11:09.6314549Z * [new branch] gh/oulgen/8/orig -> origin/gh/oulgen/8/orig 2025-12-04T11:11:09.6314617Z * [new branch] gh/oulgen/9/base -> origin/gh/oulgen/9/base 2025-12-04T11:11:09.6314685Z * [new 
branch] gh/oulgen/9/head -> origin/gh/oulgen/9/head 2025-12-04T11:11:09.6314757Z * [new branch] gh/oulgen/9/orig -> origin/gh/oulgen/9/orig 2025-12-04T11:11:09.6314865Z * [new branch] gh/patvig/mtia-serialization -> origin/gh/patvig/mtia-serialization 2025-12-04T11:11:09.6314939Z * [new branch] gh/pearu/108/base -> origin/gh/pearu/108/base 2025-12-04T11:11:09.6315012Z * [new branch] gh/pearu/108/head -> origin/gh/pearu/108/head 2025-12-04T11:11:09.6315081Z * [new branch] gh/pearu/108/orig -> origin/gh/pearu/108/orig 2025-12-04T11:11:09.6315151Z * [new branch] gh/pearu/109/base -> origin/gh/pearu/109/base 2025-12-04T11:11:09.6315222Z * [new branch] gh/pearu/109/head -> origin/gh/pearu/109/head 2025-12-04T11:11:09.6315291Z * [new branch] gh/pearu/109/orig -> origin/gh/pearu/109/orig 2025-12-04T11:11:09.6315360Z * [new branch] gh/pearu/110/base -> origin/gh/pearu/110/base 2025-12-04T11:11:09.6315433Z * [new branch] gh/pearu/110/head -> origin/gh/pearu/110/head 2025-12-04T11:11:09.6315502Z * [new branch] gh/pearu/110/orig -> origin/gh/pearu/110/orig 2025-12-04T11:11:09.6315572Z * [new branch] gh/pearu/111/base -> origin/gh/pearu/111/base 2025-12-04T11:11:09.6315643Z * [new branch] gh/pearu/111/head -> origin/gh/pearu/111/head 2025-12-04T11:11:09.6315714Z * [new branch] gh/pearu/111/orig -> origin/gh/pearu/111/orig 2025-12-04T11:11:09.6315784Z * [new branch] gh/pearu/112/base -> origin/gh/pearu/112/base 2025-12-04T11:11:09.6315856Z * [new branch] gh/pearu/112/head -> origin/gh/pearu/112/head 2025-12-04T11:11:09.6315925Z * [new branch] gh/pearu/112/orig -> origin/gh/pearu/112/orig 2025-12-04T11:11:09.6315997Z * [new branch] gh/pearu/115/base -> origin/gh/pearu/115/base 2025-12-04T11:11:09.6316090Z * [new branch] gh/pearu/115/head -> origin/gh/pearu/115/head 2025-12-04T11:11:09.6316160Z * [new branch] gh/pearu/115/orig -> origin/gh/pearu/115/orig 2025-12-04T11:11:09.6316256Z * [new branch] gh/pearu/116/base -> origin/gh/pearu/116/base 2025-12-04T11:11:09.6316326Z * [new branch] gh/pearu/116/head -> origin/gh/pearu/116/head 2025-12-04T11:11:09.6316395Z * [new branch] gh/pearu/116/orig -> origin/gh/pearu/116/orig 2025-12-04T11:11:09.6316469Z * [new branch] gh/pearu/117/base -> origin/gh/pearu/117/base 2025-12-04T11:11:09.6316538Z * [new branch] gh/pearu/117/head -> origin/gh/pearu/117/head 2025-12-04T11:11:09.6316606Z * [new branch] gh/pearu/117/orig -> origin/gh/pearu/117/orig 2025-12-04T11:11:09.6316677Z * [new branch] gh/pearu/118/base -> origin/gh/pearu/118/base 2025-12-04T11:11:09.6316747Z * [new branch] gh/pearu/118/head -> origin/gh/pearu/118/head 2025-12-04T11:11:09.6316817Z * [new branch] gh/pearu/118/orig -> origin/gh/pearu/118/orig 2025-12-04T11:11:09.6316889Z * [new branch] gh/pearu/119/base -> origin/gh/pearu/119/base 2025-12-04T11:11:09.6316959Z * [new branch] gh/pearu/119/head -> origin/gh/pearu/119/head 2025-12-04T11:11:09.6317027Z * [new branch] gh/pearu/119/orig -> origin/gh/pearu/119/orig 2025-12-04T11:11:09.6317098Z * [new branch] gh/pearu/139/base -> origin/gh/pearu/139/base 2025-12-04T11:11:09.6317167Z * [new branch] gh/pearu/139/head -> origin/gh/pearu/139/head 2025-12-04T11:11:09.6317237Z * [new branch] gh/pearu/139/orig -> origin/gh/pearu/139/orig 2025-12-04T11:11:09.6317308Z * [new branch] gh/pearu/140/base -> origin/gh/pearu/140/base 2025-12-04T11:11:09.6317378Z * [new branch] gh/pearu/140/head -> origin/gh/pearu/140/head 2025-12-04T11:11:09.6317447Z * [new branch] gh/pearu/140/orig -> origin/gh/pearu/140/orig 2025-12-04T11:11:09.6317520Z * [new branch] gh/pearu/142/base 
-> origin/gh/pearu/142/base 2025-12-04T11:11:09.6317591Z * [new branch] gh/pearu/142/head -> origin/gh/pearu/142/head 2025-12-04T11:11:09.6317666Z * [new branch] gh/pearu/142/orig -> origin/gh/pearu/142/orig 2025-12-04T11:11:09.6317735Z * [new branch] gh/pearu/143/base -> origin/gh/pearu/143/base 2025-12-04T11:11:09.6317805Z * [new branch] gh/pearu/143/head -> origin/gh/pearu/143/head 2025-12-04T11:11:09.6317878Z * [new branch] gh/pearu/143/orig -> origin/gh/pearu/143/orig 2025-12-04T11:11:09.6317949Z * [new branch] gh/pearu/147/base -> origin/gh/pearu/147/base 2025-12-04T11:11:09.6318019Z * [new branch] gh/pearu/147/head -> origin/gh/pearu/147/head 2025-12-04T11:11:09.6318092Z * [new branch] gh/pearu/147/orig -> origin/gh/pearu/147/orig 2025-12-04T11:11:09.6318199Z * [new branch] gh/pearu/149/base -> origin/gh/pearu/149/base 2025-12-04T11:11:09.6318269Z * [new branch] gh/pearu/149/head -> origin/gh/pearu/149/head 2025-12-04T11:11:09.6318340Z * [new branch] gh/pearu/149/orig -> origin/gh/pearu/149/orig 2025-12-04T11:11:09.6318408Z * [new branch] gh/pearu/150/base -> origin/gh/pearu/150/base 2025-12-04T11:11:09.6318477Z * [new branch] gh/pearu/150/head -> origin/gh/pearu/150/head 2025-12-04T11:11:09.6318549Z * [new branch] gh/pearu/150/orig -> origin/gh/pearu/150/orig 2025-12-04T11:11:09.6318619Z * [new branch] gh/pearu/151/base -> origin/gh/pearu/151/base 2025-12-04T11:11:09.6318727Z * [new branch] gh/pearu/151/head -> origin/gh/pearu/151/head 2025-12-04T11:11:09.6318800Z * [new branch] gh/pearu/151/orig -> origin/gh/pearu/151/orig 2025-12-04T11:11:09.6318869Z * [new branch] gh/pearu/152/base -> origin/gh/pearu/152/base 2025-12-04T11:11:09.6318966Z * [new branch] gh/pearu/152/head -> origin/gh/pearu/152/head 2025-12-04T11:11:09.6319039Z * [new branch] gh/pearu/152/orig -> origin/gh/pearu/152/orig 2025-12-04T11:11:09.6319108Z * [new branch] gh/pearu/153/base -> origin/gh/pearu/153/base 2025-12-04T11:11:09.6319176Z * [new branch] gh/pearu/153/head -> origin/gh/pearu/153/head 2025-12-04T11:11:09.6319246Z * [new branch] gh/pearu/153/orig -> origin/gh/pearu/153/orig 2025-12-04T11:11:09.6319317Z * [new branch] gh/pearu/154/base -> origin/gh/pearu/154/base 2025-12-04T11:11:09.6319386Z * [new branch] gh/pearu/154/head -> origin/gh/pearu/154/head 2025-12-04T11:11:09.6319458Z * [new branch] gh/pearu/154/orig -> origin/gh/pearu/154/orig 2025-12-04T11:11:09.6319527Z * [new branch] gh/pearu/155/base -> origin/gh/pearu/155/base 2025-12-04T11:11:09.6319600Z * [new branch] gh/pearu/155/head -> origin/gh/pearu/155/head 2025-12-04T11:11:09.6319667Z * [new branch] gh/pearu/155/orig -> origin/gh/pearu/155/orig 2025-12-04T11:11:09.6319736Z * [new branch] gh/pearu/156/base -> origin/gh/pearu/156/base 2025-12-04T11:11:09.6319808Z * [new branch] gh/pearu/156/head -> origin/gh/pearu/156/head 2025-12-04T11:11:09.6319877Z * [new branch] gh/pearu/156/orig -> origin/gh/pearu/156/orig 2025-12-04T11:11:09.6319945Z * [new branch] gh/pearu/56/base -> origin/gh/pearu/56/base 2025-12-04T11:11:09.6320016Z * [new branch] gh/pearu/56/head -> origin/gh/pearu/56/head 2025-12-04T11:11:09.6320086Z * [new branch] gh/pearu/56/orig -> origin/gh/pearu/56/orig 2025-12-04T11:11:09.6320154Z * [new branch] gh/pearu/97/base -> origin/gh/pearu/97/base 2025-12-04T11:11:09.6320227Z * [new branch] gh/pearu/97/head -> origin/gh/pearu/97/head 2025-12-04T11:11:09.6320294Z * [new branch] gh/pearu/97/orig -> origin/gh/pearu/97/orig 2025-12-04T11:11:09.6320373Z * [new branch] gh/pianpwk/21/base -> origin/gh/pianpwk/21/base 
2025-12-04T11:11:09.6320450Z * [new branch] gh/pianpwk/21/head -> origin/gh/pianpwk/21/head 2025-12-04T11:11:09.6320523Z * [new branch] gh/pianpwk/28/base -> origin/gh/pianpwk/28/base 2025-12-04T11:11:09.6320595Z * [new branch] gh/pianpwk/28/head -> origin/gh/pianpwk/28/head 2025-12-04T11:11:09.6320670Z * [new branch] gh/pianpwk/28/orig -> origin/gh/pianpwk/28/orig 2025-12-04T11:11:09.6320746Z * [new branch] gh/pianpwk/29/base -> origin/gh/pianpwk/29/base 2025-12-04T11:11:09.6320817Z * [new branch] gh/pianpwk/29/head -> origin/gh/pianpwk/29/head 2025-12-04T11:11:09.6320892Z * [new branch] gh/pianpwk/29/orig -> origin/gh/pianpwk/29/orig 2025-12-04T11:11:09.6320964Z * [new branch] gh/pianpwk/30/base -> origin/gh/pianpwk/30/base 2025-12-04T11:11:09.6321036Z * [new branch] gh/pianpwk/30/head -> origin/gh/pianpwk/30/head 2025-12-04T11:11:09.6321113Z * [new branch] gh/pianpwk/30/orig -> origin/gh/pianpwk/30/orig 2025-12-04T11:11:09.6321184Z * [new branch] gh/pianpwk/31/base -> origin/gh/pianpwk/31/base 2025-12-04T11:11:09.6321259Z * [new branch] gh/pianpwk/31/head -> origin/gh/pianpwk/31/head 2025-12-04T11:11:09.6321331Z * [new branch] gh/pianpwk/31/orig -> origin/gh/pianpwk/31/orig 2025-12-04T11:11:09.6321425Z * [new branch] gh/pianpwk/32/base -> origin/gh/pianpwk/32/base 2025-12-04T11:11:09.6321501Z * [new branch] gh/pianpwk/32/head -> origin/gh/pianpwk/32/head 2025-12-04T11:11:09.6321593Z * [new branch] gh/pianpwk/32/orig -> origin/gh/pianpwk/32/orig 2025-12-04T11:11:09.6321664Z * [new branch] gh/pianpwk/33/base -> origin/gh/pianpwk/33/base 2025-12-04T11:11:09.6321739Z * [new branch] gh/pianpwk/33/head -> origin/gh/pianpwk/33/head 2025-12-04T11:11:09.6321810Z * [new branch] gh/pianpwk/33/orig -> origin/gh/pianpwk/33/orig 2025-12-04T11:11:09.6321881Z * [new branch] gh/pianpwk/34/base -> origin/gh/pianpwk/34/base 2025-12-04T11:11:09.6321956Z * [new branch] gh/pianpwk/34/head -> origin/gh/pianpwk/34/head 2025-12-04T11:11:09.6322026Z * [new branch] gh/pianpwk/34/orig -> origin/gh/pianpwk/34/orig 2025-12-04T11:11:09.6322098Z * [new branch] gh/pianpwk/35/base -> origin/gh/pianpwk/35/base 2025-12-04T11:11:09.6322171Z * [new branch] gh/pianpwk/35/head -> origin/gh/pianpwk/35/head 2025-12-04T11:11:09.6322242Z * [new branch] gh/pianpwk/35/orig -> origin/gh/pianpwk/35/orig 2025-12-04T11:11:09.6322314Z * [new branch] gh/rec/141/base -> origin/gh/rec/141/base 2025-12-04T11:11:09.6322386Z * [new branch] gh/rec/141/head -> origin/gh/rec/141/head 2025-12-04T11:11:09.6322454Z * [new branch] gh/rec/153/base -> origin/gh/rec/153/base 2025-12-04T11:11:09.6322519Z * [new branch] gh/rec/153/head -> origin/gh/rec/153/head 2025-12-04T11:11:09.6322589Z * [new branch] gh/rec/153/orig -> origin/gh/rec/153/orig 2025-12-04T11:11:09.6322655Z * [new branch] gh/rec/154/base -> origin/gh/rec/154/base 2025-12-04T11:11:09.6322721Z * [new branch] gh/rec/154/head -> origin/gh/rec/154/head 2025-12-04T11:11:09.6322790Z * [new branch] gh/rec/154/orig -> origin/gh/rec/154/orig 2025-12-04T11:11:09.6322857Z * [new branch] gh/rec/164/base -> origin/gh/rec/164/base 2025-12-04T11:11:09.6322924Z * [new branch] gh/rec/164/head -> origin/gh/rec/164/head 2025-12-04T11:11:09.6322992Z * [new branch] gh/rec/164/orig -> origin/gh/rec/164/orig 2025-12-04T11:11:09.6323058Z * [new branch] gh/rec/166/base -> origin/gh/rec/166/base 2025-12-04T11:11:09.6323127Z * [new branch] gh/rec/166/head -> origin/gh/rec/166/head 2025-12-04T11:11:09.6323192Z * [new branch] gh/rec/166/orig -> origin/gh/rec/166/orig 2025-12-04T11:11:09.6323258Z * [new branch] 
gh/rec/167/base -> origin/gh/rec/167/base 2025-12-04T11:11:09.6323328Z * [new branch] gh/rec/167/head -> origin/gh/rec/167/head 2025-12-04T11:11:09.6323396Z * [new branch] gh/rec/167/orig -> origin/gh/rec/167/orig 2025-12-04T11:11:09.6323461Z * [new branch] gh/rec/168/base -> origin/gh/rec/168/base 2025-12-04T11:11:09.6323533Z * [new branch] gh/rec/168/head -> origin/gh/rec/168/head 2025-12-04T11:11:09.6323599Z * [new branch] gh/rec/168/orig -> origin/gh/rec/168/orig 2025-12-04T11:11:09.6323665Z * [new branch] gh/rec/169/base -> origin/gh/rec/169/base 2025-12-04T11:11:09.6323735Z * [new branch] gh/rec/169/head -> origin/gh/rec/169/head 2025-12-04T11:11:09.6323799Z * [new branch] gh/rec/169/orig -> origin/gh/rec/169/orig 2025-12-04T11:11:09.6323865Z * [new branch] gh/rec/170/base -> origin/gh/rec/170/base 2025-12-04T11:11:09.6323933Z * [new branch] gh/rec/170/head -> origin/gh/rec/170/head 2025-12-04T11:11:09.6324018Z * [new branch] gh/rec/170/orig -> origin/gh/rec/170/orig 2025-12-04T11:11:09.6324085Z * [new branch] gh/rec/171/base -> origin/gh/rec/171/base 2025-12-04T11:11:09.6324174Z * [new branch] gh/rec/171/head -> origin/gh/rec/171/head 2025-12-04T11:11:09.6324239Z * [new branch] gh/rec/171/orig -> origin/gh/rec/171/orig 2025-12-04T11:11:09.6324304Z * [new branch] gh/rec/172/base -> origin/gh/rec/172/base 2025-12-04T11:11:09.6324373Z * [new branch] gh/rec/172/head -> origin/gh/rec/172/head 2025-12-04T11:11:09.6324438Z * [new branch] gh/rec/172/orig -> origin/gh/rec/172/orig 2025-12-04T11:11:09.6324505Z * [new branch] gh/rec/173/base -> origin/gh/rec/173/base 2025-12-04T11:11:09.6324573Z * [new branch] gh/rec/173/head -> origin/gh/rec/173/head 2025-12-04T11:11:09.6324640Z * [new branch] gh/rec/173/orig -> origin/gh/rec/173/orig 2025-12-04T11:11:09.6324705Z * [new branch] gh/rec/174/base -> origin/gh/rec/174/base 2025-12-04T11:11:09.6324774Z * [new branch] gh/rec/174/head -> origin/gh/rec/174/head 2025-12-04T11:11:09.6324842Z * [new branch] gh/rec/174/orig -> origin/gh/rec/174/orig 2025-12-04T11:11:09.6324911Z * [new branch] gh/rec/175/base -> origin/gh/rec/175/base 2025-12-04T11:11:09.6324977Z * [new branch] gh/rec/175/head -> origin/gh/rec/175/head 2025-12-04T11:11:09.6325042Z * [new branch] gh/rec/175/orig -> origin/gh/rec/175/orig 2025-12-04T11:11:09.6325110Z * [new branch] gh/rec/176/base -> origin/gh/rec/176/base 2025-12-04T11:11:09.6325173Z * [new branch] gh/rec/176/head -> origin/gh/rec/176/head 2025-12-04T11:11:09.6325239Z * [new branch] gh/rec/176/orig -> origin/gh/rec/176/orig 2025-12-04T11:11:09.6325307Z * [new branch] gh/rec/177/base -> origin/gh/rec/177/base 2025-12-04T11:11:09.6325373Z * [new branch] gh/rec/177/head -> origin/gh/rec/177/head 2025-12-04T11:11:09.6325441Z * [new branch] gh/rec/177/orig -> origin/gh/rec/177/orig 2025-12-04T11:11:09.6325535Z * [new branch] gh/robert-hardwick/3/base -> origin/gh/robert-hardwick/3/base 2025-12-04T11:11:09.6325623Z * [new branch] gh/robert-hardwick/3/head -> origin/gh/robert-hardwick/3/head 2025-12-04T11:11:09.6325708Z * [new branch] gh/robert-hardwick/3/orig -> origin/gh/robert-hardwick/3/orig 2025-12-04T11:11:09.6325795Z * [new branch] gh/robert-hardwick/4/base -> origin/gh/robert-hardwick/4/base 2025-12-04T11:11:09.6325880Z * [new branch] gh/robert-hardwick/4/head -> origin/gh/robert-hardwick/4/head 2025-12-04T11:11:09.6325965Z * [new branch] gh/robert-hardwick/4/orig -> origin/gh/robert-hardwick/4/orig 2025-12-04T11:11:09.6326052Z * [new branch] gh/robert-hardwick/5/base -> origin/gh/robert-hardwick/5/base 
2025-12-04T11:11:09.6326136Z * [new branch] gh/robert-hardwick/5/head -> origin/gh/robert-hardwick/5/head 2025-12-04T11:11:09.6326221Z * [new branch] gh/robert-hardwick/5/orig -> origin/gh/robert-hardwick/5/orig 2025-12-04T11:11:09.6326310Z * [new branch] gh/robert-hardwick/6/base -> origin/gh/robert-hardwick/6/base 2025-12-04T11:11:09.6326394Z * [new branch] gh/robert-hardwick/6/head -> origin/gh/robert-hardwick/6/head 2025-12-04T11:11:09.6326478Z * [new branch] gh/robert-hardwick/6/orig -> origin/gh/robert-hardwick/6/orig 2025-12-04T11:11:09.6326564Z * [new branch] gh/robert-hardwick/7/base -> origin/gh/robert-hardwick/7/base 2025-12-04T11:11:09.6326648Z * [new branch] gh/robert-hardwick/7/head -> origin/gh/robert-hardwick/7/head 2025-12-04T11:11:09.6326754Z * [new branch] gh/robert-hardwick/7/orig -> origin/gh/robert-hardwick/7/orig 2025-12-04T11:11:09.6326841Z * [new branch] gh/robert-hardwick/8/base -> origin/gh/robert-hardwick/8/base 2025-12-04T11:11:09.6326950Z * [new branch] gh/robert-hardwick/8/head -> origin/gh/robert-hardwick/8/head 2025-12-04T11:11:09.6327038Z * [new branch] gh/robert-hardwick/8/orig -> origin/gh/robert-hardwick/8/orig 2025-12-04T11:11:09.6327121Z * [new branch] gh/robert-hardwick/9/base -> origin/gh/robert-hardwick/9/base 2025-12-04T11:11:09.6327207Z * [new branch] gh/robert-hardwick/9/head -> origin/gh/robert-hardwick/9/head 2025-12-04T11:11:09.6327295Z * [new branch] gh/robert-hardwick/9/orig -> origin/gh/robert-hardwick/9/orig 2025-12-04T11:11:09.6327367Z * [new branch] gh/rtimpe/1/base -> origin/gh/rtimpe/1/base 2025-12-04T11:11:09.6327440Z * [new branch] gh/rtimpe/1/head -> origin/gh/rtimpe/1/head 2025-12-04T11:11:09.6327513Z * [new branch] gh/rtimpe/2/base -> origin/gh/rtimpe/2/base 2025-12-04T11:11:09.6327581Z * [new branch] gh/rtimpe/2/head -> origin/gh/rtimpe/2/head 2025-12-04T11:11:09.6327654Z * [new branch] gh/rtimpe/22/base -> origin/gh/rtimpe/22/base 2025-12-04T11:11:09.6327727Z * [new branch] gh/rtimpe/22/head -> origin/gh/rtimpe/22/head 2025-12-04T11:11:09.6327798Z * [new branch] gh/rtimpe/22/orig -> origin/gh/rtimpe/22/orig 2025-12-04T11:11:09.6327867Z * [new branch] gh/rtimpe/23/base -> origin/gh/rtimpe/23/base 2025-12-04T11:11:09.6327940Z * [new branch] gh/rtimpe/23/head -> origin/gh/rtimpe/23/head 2025-12-04T11:11:09.6328010Z * [new branch] gh/rtimpe/23/orig -> origin/gh/rtimpe/23/orig 2025-12-04T11:11:09.6328080Z * [new branch] gh/rtimpe/24/base -> origin/gh/rtimpe/24/base 2025-12-04T11:11:09.6328194Z * [new branch] gh/rtimpe/24/head -> origin/gh/rtimpe/24/head 2025-12-04T11:11:09.6328264Z * [new branch] gh/rtimpe/24/orig -> origin/gh/rtimpe/24/orig 2025-12-04T11:11:09.6328338Z * [new branch] gh/rtimpe/25/base -> origin/gh/rtimpe/25/base 2025-12-04T11:11:09.6328407Z * [new branch] gh/rtimpe/25/head -> origin/gh/rtimpe/25/head 2025-12-04T11:11:09.6328477Z * [new branch] gh/rtimpe/25/orig -> origin/gh/rtimpe/25/orig 2025-12-04T11:11:09.6328548Z * [new branch] gh/rtimpe/26/base -> origin/gh/rtimpe/26/base 2025-12-04T11:11:09.6328617Z * [new branch] gh/rtimpe/26/head -> origin/gh/rtimpe/26/head 2025-12-04T11:11:09.6328686Z * [new branch] gh/rtimpe/26/orig -> origin/gh/rtimpe/26/orig 2025-12-04T11:11:09.6328760Z * [new branch] gh/rtimpe/27/base -> origin/gh/rtimpe/27/base 2025-12-04T11:11:09.6328829Z * [new branch] gh/rtimpe/27/head -> origin/gh/rtimpe/27/head 2025-12-04T11:11:09.6328898Z * [new branch] gh/rtimpe/27/orig -> origin/gh/rtimpe/27/orig 2025-12-04T11:11:09.6328972Z * [new branch] gh/rtimpe/28/base -> origin/gh/rtimpe/28/base 
2025-12-04T11:11:09.6329041Z * [new branch] gh/rtimpe/28/head -> origin/gh/rtimpe/28/head 2025-12-04T11:11:09.6329109Z * [new branch] gh/rtimpe/28/orig -> origin/gh/rtimpe/28/orig 2025-12-04T11:11:09.6329182Z * [new branch] gh/rtimpe/29/base -> origin/gh/rtimpe/29/base 2025-12-04T11:11:09.6329254Z * [new branch] gh/rtimpe/29/head -> origin/gh/rtimpe/29/head 2025-12-04T11:11:09.6329325Z * [new branch] gh/rtimpe/29/orig -> origin/gh/rtimpe/29/orig 2025-12-04T11:11:09.6329397Z * [new branch] gh/rtimpe/3/base -> origin/gh/rtimpe/3/base 2025-12-04T11:11:09.6329499Z * [new branch] gh/rtimpe/3/head -> origin/gh/rtimpe/3/head 2025-12-04T11:11:09.6329571Z * [new branch] gh/rtimpe/30/base -> origin/gh/rtimpe/30/base 2025-12-04T11:11:09.6329668Z * [new branch] gh/rtimpe/30/head -> origin/gh/rtimpe/30/head 2025-12-04T11:11:09.6329738Z * [new branch] gh/rtimpe/30/orig -> origin/gh/rtimpe/30/orig 2025-12-04T11:11:09.6329806Z * [new branch] gh/rtimpe/31/base -> origin/gh/rtimpe/31/base 2025-12-04T11:11:09.6329878Z * [new branch] gh/rtimpe/31/head -> origin/gh/rtimpe/31/head 2025-12-04T11:11:09.6329946Z * [new branch] gh/rtimpe/31/orig -> origin/gh/rtimpe/31/orig 2025-12-04T11:11:09.6330019Z * [new branch] gh/rtimpe/32/base -> origin/gh/rtimpe/32/base 2025-12-04T11:11:09.6330088Z * [new branch] gh/rtimpe/32/head -> origin/gh/rtimpe/32/head 2025-12-04T11:11:09.6330159Z * [new branch] gh/rtimpe/32/orig -> origin/gh/rtimpe/32/orig 2025-12-04T11:11:09.6330230Z * [new branch] gh/rtimpe/33/base -> origin/gh/rtimpe/33/base 2025-12-04T11:11:09.6330303Z * [new branch] gh/rtimpe/33/head -> origin/gh/rtimpe/33/head 2025-12-04T11:11:09.6330372Z * [new branch] gh/rtimpe/33/orig -> origin/gh/rtimpe/33/orig 2025-12-04T11:11:09.6330447Z * [new branch] gh/rtimpe/34/base -> origin/gh/rtimpe/34/base 2025-12-04T11:11:09.6330516Z * [new branch] gh/rtimpe/34/head -> origin/gh/rtimpe/34/head 2025-12-04T11:11:09.6330585Z * [new branch] gh/rtimpe/34/orig -> origin/gh/rtimpe/34/orig 2025-12-04T11:11:09.6330656Z * [new branch] gh/rtimpe/35/base -> origin/gh/rtimpe/35/base 2025-12-04T11:11:09.6330724Z * [new branch] gh/rtimpe/35/head -> origin/gh/rtimpe/35/head 2025-12-04T11:11:09.6330795Z * [new branch] gh/rtimpe/35/orig -> origin/gh/rtimpe/35/orig 2025-12-04T11:11:09.6330868Z * [new branch] gh/rtimpe/4/base -> origin/gh/rtimpe/4/base 2025-12-04T11:11:09.6330938Z * [new branch] gh/rtimpe/4/head -> origin/gh/rtimpe/4/head 2025-12-04T11:11:09.6331026Z * [new branch] gh/ruisizhang123/1/base -> origin/gh/ruisizhang123/1/base 2025-12-04T11:11:09.6331112Z * [new branch] gh/ruisizhang123/1/head -> origin/gh/ruisizhang123/1/head 2025-12-04T11:11:09.6331191Z * [new branch] gh/ruisizhang123/1/orig -> origin/gh/ruisizhang123/1/orig 2025-12-04T11:11:09.6331271Z * [new branch] gh/ruisizhang123/4/base -> origin/gh/ruisizhang123/4/base 2025-12-04T11:11:09.6331353Z * [new branch] gh/ruisizhang123/4/head -> origin/gh/ruisizhang123/4/head 2025-12-04T11:11:09.6331430Z * [new branch] gh/ruisizhang123/4/orig -> origin/gh/ruisizhang123/4/orig 2025-12-04T11:11:09.6331510Z * [new branch] gh/ruisizhang123/5/base -> origin/gh/ruisizhang123/5/base 2025-12-04T11:11:09.6331597Z * [new branch] gh/ruisizhang123/5/head -> origin/gh/ruisizhang123/5/head 2025-12-04T11:11:09.6331681Z * [new branch] gh/ruisizhang123/5/orig -> origin/gh/ruisizhang123/5/orig 2025-12-04T11:11:09.6331765Z * [new branch] gh/ruisizhang123/6/base -> origin/gh/ruisizhang123/6/base 2025-12-04T11:11:09.6331844Z * [new branch] gh/ruisizhang123/6/head -> origin/gh/ruisizhang123/6/head 
2025-12-04T11:11:09.6331922Z * [new branch] gh/ruisizhang123/6/orig -> origin/gh/ruisizhang123/6/orig 2025-12-04T11:11:09.6332004Z * [new branch] gh/ruisizhang123/7/base -> origin/gh/ruisizhang123/7/base 2025-12-04T11:11:09.6332081Z * [new branch] gh/ruisizhang123/7/head -> origin/gh/ruisizhang123/7/head 2025-12-04T11:11:09.6332182Z * [new branch] gh/ruisizhang123/7/orig -> origin/gh/ruisizhang123/7/orig 2025-12-04T11:11:09.6332263Z * [new branch] gh/ruisizhang123/8/base -> origin/gh/ruisizhang123/8/base 2025-12-04T11:11:09.6332343Z * [new branch] gh/ruisizhang123/8/head -> origin/gh/ruisizhang123/8/head 2025-12-04T11:11:09.6332446Z * [new branch] gh/ruisizhang123/8/orig -> origin/gh/ruisizhang123/8/orig 2025-12-04T11:11:09.6332526Z * [new branch] gh/ruisizhang123/9/base -> origin/gh/ruisizhang123/9/base 2025-12-04T11:11:09.6332604Z * [new branch] gh/ruisizhang123/9/head -> origin/gh/ruisizhang123/9/head 2025-12-04T11:11:09.6332681Z * [new branch] gh/ruisizhang123/9/orig -> origin/gh/ruisizhang123/9/orig 2025-12-04T11:11:09.6332765Z * [new branch] gh/seemethere/52/base -> origin/gh/seemethere/52/base 2025-12-04T11:11:09.6332843Z * [new branch] gh/seemethere/52/head -> origin/gh/seemethere/52/head 2025-12-04T11:11:09.6332920Z * [new branch] gh/seemethere/52/orig -> origin/gh/seemethere/52/orig 2025-12-04T11:11:09.6333001Z * [new branch] gh/seemethere/53/base -> origin/gh/seemethere/53/base 2025-12-04T11:11:09.6333077Z * [new branch] gh/seemethere/53/head -> origin/gh/seemethere/53/head 2025-12-04T11:11:09.6333155Z * [new branch] gh/seemethere/53/orig -> origin/gh/seemethere/53/orig 2025-12-04T11:11:09.6333232Z * [new branch] gh/seemethere/54/base -> origin/gh/seemethere/54/base 2025-12-04T11:11:09.6333307Z * [new branch] gh/seemethere/54/head -> origin/gh/seemethere/54/head 2025-12-04T11:11:09.6333381Z * [new branch] gh/seemethere/54/orig -> origin/gh/seemethere/54/orig 2025-12-04T11:11:09.6333459Z * [new branch] gh/seemethere/55/base -> origin/gh/seemethere/55/base 2025-12-04T11:11:09.6333534Z * [new branch] gh/seemethere/55/head -> origin/gh/seemethere/55/head 2025-12-04T11:11:09.6333615Z * [new branch] gh/seemethere/55/orig -> origin/gh/seemethere/55/orig 2025-12-04T11:11:09.6333691Z * [new branch] gh/seemethere/59/base -> origin/gh/seemethere/59/base 2025-12-04T11:11:09.6333769Z * [new branch] gh/seemethere/59/head -> origin/gh/seemethere/59/head 2025-12-04T11:11:09.6333847Z * [new branch] gh/seemethere/59/orig -> origin/gh/seemethere/59/orig 2025-12-04T11:11:09.6333923Z * [new branch] gh/seemethere/62/base -> origin/gh/seemethere/62/base 2025-12-04T11:11:09.6333999Z * [new branch] gh/seemethere/62/head -> origin/gh/seemethere/62/head 2025-12-04T11:11:09.6334078Z * [new branch] gh/seemethere/62/orig -> origin/gh/seemethere/62/orig 2025-12-04T11:11:09.6334155Z * [new branch] gh/seemethere/63/base -> origin/gh/seemethere/63/base 2025-12-04T11:11:09.6334231Z * [new branch] gh/seemethere/63/head -> origin/gh/seemethere/63/head 2025-12-04T11:11:09.6334313Z * [new branch] gh/seemethere/63/orig -> origin/gh/seemethere/63/orig 2025-12-04T11:11:09.6334388Z * [new branch] gh/seemethere/71/base -> origin/gh/seemethere/71/base 2025-12-04T11:11:09.6334465Z * [new branch] gh/seemethere/71/head -> origin/gh/seemethere/71/head 2025-12-04T11:11:09.6334546Z * [new branch] gh/seemethere/71/orig -> origin/gh/seemethere/71/orig 2025-12-04T11:11:09.6334621Z * [new branch] gh/seemethere/72/base -> origin/gh/seemethere/72/base 2025-12-04T11:11:09.6334696Z * [new branch] gh/seemethere/72/head -> 
origin/gh/seemethere/72/head 2025-12-04T11:11:09.6334774Z * [new branch] gh/seemethere/72/orig -> origin/gh/seemethere/72/orig 2025-12-04T11:11:09.6334850Z * [new branch] gh/seemethere/73/base -> origin/gh/seemethere/73/base 2025-12-04T11:11:09.6334926Z * [new branch] gh/seemethere/73/head -> origin/gh/seemethere/73/head 2025-12-04T11:11:09.6335026Z * [new branch] gh/seemethere/73/orig -> origin/gh/seemethere/73/orig 2025-12-04T11:11:09.6335102Z * [new branch] gh/seemethere/74/base -> origin/gh/seemethere/74/base 2025-12-04T11:11:09.6335203Z * [new branch] gh/seemethere/74/head -> origin/gh/seemethere/74/head 2025-12-04T11:11:09.6335279Z * [new branch] gh/seemethere/74/orig -> origin/gh/seemethere/74/orig 2025-12-04T11:11:09.6335354Z * [new branch] gh/seemethere/75/base -> origin/gh/seemethere/75/base 2025-12-04T11:11:09.6335432Z * [new branch] gh/seemethere/75/head -> origin/gh/seemethere/75/head 2025-12-04T11:11:09.6335509Z * [new branch] gh/seemethere/75/orig -> origin/gh/seemethere/75/orig 2025-12-04T11:11:09.6335585Z * [new branch] gh/seemethere/76/base -> origin/gh/seemethere/76/base 2025-12-04T11:11:09.6335665Z * [new branch] gh/seemethere/76/head -> origin/gh/seemethere/76/head 2025-12-04T11:11:09.6335741Z * [new branch] gh/seemethere/76/orig -> origin/gh/seemethere/76/orig 2025-12-04T11:11:09.6335821Z * [new branch] gh/shunting314/145/base -> origin/gh/shunting314/145/base 2025-12-04T11:11:09.6335905Z * [new branch] gh/shunting314/145/head -> origin/gh/shunting314/145/head 2025-12-04T11:11:09.6335983Z * [new branch] gh/shunting314/145/orig -> origin/gh/shunting314/145/orig 2025-12-04T11:11:09.6336062Z * [new branch] gh/shunting314/176/base -> origin/gh/shunting314/176/base 2025-12-04T11:11:09.6336142Z * [new branch] gh/shunting314/176/head -> origin/gh/shunting314/176/head 2025-12-04T11:11:09.6336218Z * [new branch] gh/shunting314/176/orig -> origin/gh/shunting314/176/orig 2025-12-04T11:11:09.6336294Z * [new branch] gh/shunting314/249/base -> origin/gh/shunting314/249/base 2025-12-04T11:11:09.6336375Z * [new branch] gh/shunting314/249/head -> origin/gh/shunting314/249/head 2025-12-04T11:11:09.6336452Z * [new branch] gh/shunting314/249/orig -> origin/gh/shunting314/249/orig 2025-12-04T11:11:09.6336528Z * [new branch] gh/shunting314/253/base -> origin/gh/shunting314/253/base 2025-12-04T11:11:09.6336607Z * [new branch] gh/shunting314/253/head -> origin/gh/shunting314/253/head 2025-12-04T11:11:09.6336684Z * [new branch] gh/shunting314/253/orig -> origin/gh/shunting314/253/orig 2025-12-04T11:11:09.6336906Z * [new branch] gh/shunting314/256/base -> origin/gh/shunting314/256/base 2025-12-04T11:11:09.6336984Z * [new branch] gh/shunting314/256/head -> origin/gh/shunting314/256/head 2025-12-04T11:11:09.6337062Z * [new branch] gh/shunting314/256/orig -> origin/gh/shunting314/256/orig 2025-12-04T11:11:09.6337142Z * [new branch] gh/shunting314/257/base -> origin/gh/shunting314/257/base 2025-12-04T11:11:09.6337219Z * [new branch] gh/shunting314/257/head -> origin/gh/shunting314/257/head 2025-12-04T11:11:09.6337295Z * [new branch] gh/shunting314/257/orig -> origin/gh/shunting314/257/orig 2025-12-04T11:11:09.6337378Z * [new branch] gh/shunting314/258/base -> origin/gh/shunting314/258/base 2025-12-04T11:11:09.6337454Z * [new branch] gh/shunting314/258/head -> origin/gh/shunting314/258/head 2025-12-04T11:11:09.6337531Z * [new branch] gh/shunting314/258/orig -> origin/gh/shunting314/258/orig 2025-12-04T11:11:09.6337611Z * [new branch] gh/shunting314/259/base -> origin/gh/shunting314/259/base 
2025-12-04T11:11:09.6337688Z * [new branch] gh/shunting314/259/head -> origin/gh/shunting314/259/head 2025-12-04T11:11:09.6337765Z * [new branch] gh/shunting314/259/orig -> origin/gh/shunting314/259/orig 2025-12-04T11:11:09.6337871Z * [new branch] gh/shunting314/260/base -> origin/gh/shunting314/260/base 2025-12-04T11:11:09.6337949Z * [new branch] gh/shunting314/260/head -> origin/gh/shunting314/260/head 2025-12-04T11:11:09.6338026Z * [new branch] gh/shunting314/260/orig -> origin/gh/shunting314/260/orig 2025-12-04T11:11:09.6338142Z * [new branch] gh/shunting314/261/base -> origin/gh/shunting314/261/base 2025-12-04T11:11:09.6338254Z * [new branch] gh/shunting314/261/head -> origin/gh/shunting314/261/head 2025-12-04T11:11:09.6338331Z * [new branch] gh/shunting314/261/orig -> origin/gh/shunting314/261/orig 2025-12-04T11:11:09.6338412Z * [new branch] gh/shunting314/262/base -> origin/gh/shunting314/262/base 2025-12-04T11:11:09.6338490Z * [new branch] gh/shunting314/262/head -> origin/gh/shunting314/262/head 2025-12-04T11:11:09.6338572Z * [new branch] gh/shunting314/262/orig -> origin/gh/shunting314/262/orig 2025-12-04T11:11:09.6338656Z * [new branch] gh/shunting314/263/base -> origin/gh/shunting314/263/base 2025-12-04T11:11:09.6338734Z * [new branch] gh/shunting314/263/head -> origin/gh/shunting314/263/head 2025-12-04T11:11:09.6338816Z * [new branch] gh/shunting314/263/orig -> origin/gh/shunting314/263/orig 2025-12-04T11:11:09.6338894Z * [new branch] gh/shunting314/264/base -> origin/gh/shunting314/264/base 2025-12-04T11:11:09.6338971Z * [new branch] gh/shunting314/264/head -> origin/gh/shunting314/264/head 2025-12-04T11:11:09.6339050Z * [new branch] gh/shunting314/264/orig -> origin/gh/shunting314/264/orig 2025-12-04T11:11:09.6339126Z * [new branch] gh/shunting314/265/base -> origin/gh/shunting314/265/base 2025-12-04T11:11:09.6339203Z * [new branch] gh/shunting314/265/head -> origin/gh/shunting314/265/head 2025-12-04T11:11:09.6339285Z * [new branch] gh/shunting314/265/orig -> origin/gh/shunting314/265/orig 2025-12-04T11:11:09.6339364Z * [new branch] gh/shunting314/266/base -> origin/gh/shunting314/266/base 2025-12-04T11:11:09.6339441Z * [new branch] gh/shunting314/266/head -> origin/gh/shunting314/266/head 2025-12-04T11:11:09.6339525Z * [new branch] gh/shunting314/266/orig -> origin/gh/shunting314/266/orig 2025-12-04T11:11:09.6339603Z * [new branch] gh/shunting314/267/base -> origin/gh/shunting314/267/base 2025-12-04T11:11:09.6339681Z * [new branch] gh/shunting314/267/head -> origin/gh/shunting314/267/head 2025-12-04T11:11:09.6339762Z * [new branch] gh/shunting314/267/orig -> origin/gh/shunting314/267/orig 2025-12-04T11:11:09.6339840Z * [new branch] gh/shunting314/268/base -> origin/gh/shunting314/268/base 2025-12-04T11:11:09.6339917Z * [new branch] gh/shunting314/268/head -> origin/gh/shunting314/268/head 2025-12-04T11:11:09.6340000Z * [new branch] gh/shunting314/268/orig -> origin/gh/shunting314/268/orig 2025-12-04T11:11:09.6340079Z * [new branch] gh/shunting314/269/base -> origin/gh/shunting314/269/base 2025-12-04T11:11:09.6340157Z * [new branch] gh/shunting314/269/head -> origin/gh/shunting314/269/head 2025-12-04T11:11:09.6340240Z * [new branch] gh/shunting314/269/orig -> origin/gh/shunting314/269/orig 2025-12-04T11:11:09.6340316Z * [new branch] gh/silverguo/1/base -> origin/gh/silverguo/1/base 2025-12-04T11:11:09.6340394Z * [new branch] gh/silverguo/1/head -> origin/gh/silverguo/1/head 2025-12-04T11:11:09.6340470Z * [new branch] gh/silverguo/2/base -> origin/gh/silverguo/2/base 
2025-12-04T11:11:09.6340545Z * [new branch] gh/silverguo/2/head -> origin/gh/silverguo/2/head 2025-12-04T11:11:09.6340625Z * [new branch] gh/silverguo/3/base -> origin/gh/silverguo/3/base 2025-12-04T11:11:09.6340725Z * [new branch] gh/silverguo/3/head -> origin/gh/silverguo/3/head 2025-12-04T11:11:09.6340799Z * [new branch] gh/silverguo/4/base -> origin/gh/silverguo/4/base 2025-12-04T11:11:09.6340876Z * [new branch] gh/silverguo/4/head -> origin/gh/silverguo/4/head 2025-12-04T11:11:09.6340978Z * [new branch] gh/slayton58/39/base -> origin/gh/slayton58/39/base 2025-12-04T11:11:09.6341054Z * [new branch] gh/slayton58/39/head -> origin/gh/slayton58/39/head 2025-12-04T11:11:09.6341133Z * [new branch] gh/slayton58/39/orig -> origin/gh/slayton58/39/orig 2025-12-04T11:11:09.6341207Z * [new branch] gh/slayton58/42/base -> origin/gh/slayton58/42/base 2025-12-04T11:11:09.6341280Z * [new branch] gh/slayton58/42/head -> origin/gh/slayton58/42/head 2025-12-04T11:11:09.6341358Z * [new branch] gh/slayton58/42/orig -> origin/gh/slayton58/42/orig 2025-12-04T11:11:09.6341432Z * [new branch] gh/slayton58/43/base -> origin/gh/slayton58/43/base 2025-12-04T11:11:09.6341507Z * [new branch] gh/slayton58/43/head -> origin/gh/slayton58/43/head 2025-12-04T11:11:09.6341583Z * [new branch] gh/slayton58/43/orig -> origin/gh/slayton58/43/orig 2025-12-04T11:11:09.6341658Z * [new branch] gh/slayton58/44/base -> origin/gh/slayton58/44/base 2025-12-04T11:11:09.6341731Z * [new branch] gh/slayton58/44/head -> origin/gh/slayton58/44/head 2025-12-04T11:11:09.6341808Z * [new branch] gh/slayton58/44/orig -> origin/gh/slayton58/44/orig 2025-12-04T11:11:09.6341881Z * [new branch] gh/slayton58/45/base -> origin/gh/slayton58/45/base 2025-12-04T11:11:09.6341959Z * [new branch] gh/slayton58/45/head -> origin/gh/slayton58/45/head 2025-12-04T11:11:09.6342033Z * [new branch] gh/slayton58/45/orig -> origin/gh/slayton58/45/orig 2025-12-04T11:11:09.6342106Z * [new branch] gh/slayton58/46/base -> origin/gh/slayton58/46/base 2025-12-04T11:11:09.6342183Z * [new branch] gh/slayton58/46/head -> origin/gh/slayton58/46/head 2025-12-04T11:11:09.6342255Z * [new branch] gh/slayton58/46/orig -> origin/gh/slayton58/46/orig 2025-12-04T11:11:09.6342331Z * [new branch] gh/slayton58/6/base -> origin/gh/slayton58/6/base 2025-12-04T11:11:09.6342407Z * [new branch] gh/slayton58/6/head -> origin/gh/slayton58/6/head 2025-12-04T11:11:09.6342480Z * [new branch] gh/slayton58/7/base -> origin/gh/slayton58/7/base 2025-12-04T11:11:09.6342552Z * [new branch] gh/slayton58/7/head -> origin/gh/slayton58/7/head 2025-12-04T11:11:09.6342633Z * [new branch] gh/soulitzer/269/base -> origin/gh/soulitzer/269/base 2025-12-04T11:11:09.6342710Z * [new branch] gh/soulitzer/269/head -> origin/gh/soulitzer/269/head 2025-12-04T11:11:09.6342787Z * [new branch] gh/soulitzer/269/orig -> origin/gh/soulitzer/269/orig 2025-12-04T11:11:09.6342865Z * [new branch] gh/soulitzer/276/base -> origin/gh/soulitzer/276/base 2025-12-04T11:11:09.6342940Z * [new branch] gh/soulitzer/276/head -> origin/gh/soulitzer/276/head 2025-12-04T11:11:09.6343017Z * [new branch] gh/soulitzer/276/orig -> origin/gh/soulitzer/276/orig 2025-12-04T11:11:09.6343097Z * [new branch] gh/soulitzer/287/base -> origin/gh/soulitzer/287/base 2025-12-04T11:11:09.6343172Z * [new branch] gh/soulitzer/287/head -> origin/gh/soulitzer/287/head 2025-12-04T11:11:09.6343247Z * [new branch] gh/soulitzer/287/orig -> origin/gh/soulitzer/287/orig 2025-12-04T11:11:09.6343325Z * [new branch] gh/soulitzer/296/base -> origin/gh/soulitzer/296/base 
2025-12-04T11:11:09.6343401Z * [new branch] gh/soulitzer/296/head -> origin/gh/soulitzer/296/head 2025-12-04T11:11:09.6343500Z * [new branch] gh/soulitzer/296/orig -> origin/gh/soulitzer/296/orig 2025-12-04T11:11:09.6343581Z * [new branch] gh/soulitzer/299/base -> origin/gh/soulitzer/299/base 2025-12-04T11:11:09.6343678Z * [new branch] gh/soulitzer/299/head -> origin/gh/soulitzer/299/head 2025-12-04T11:11:09.6343756Z * [new branch] gh/soulitzer/299/orig -> origin/gh/soulitzer/299/orig 2025-12-04T11:11:09.6343830Z * [new branch] gh/soulitzer/300/base -> origin/gh/soulitzer/300/base 2025-12-04T11:11:09.6343906Z * [new branch] gh/soulitzer/300/head -> origin/gh/soulitzer/300/head 2025-12-04T11:11:09.6343984Z * [new branch] gh/soulitzer/300/orig -> origin/gh/soulitzer/300/orig 2025-12-04T11:11:09.6344057Z * [new branch] gh/soulitzer/301/base -> origin/gh/soulitzer/301/base 2025-12-04T11:11:09.6344132Z * [new branch] gh/soulitzer/301/head -> origin/gh/soulitzer/301/head 2025-12-04T11:11:09.6344211Z * [new branch] gh/soulitzer/301/orig -> origin/gh/soulitzer/301/orig 2025-12-04T11:11:09.6344286Z * [new branch] gh/soulitzer/313/base -> origin/gh/soulitzer/313/base 2025-12-04T11:11:09.6344362Z * [new branch] gh/soulitzer/313/head -> origin/gh/soulitzer/313/head 2025-12-04T11:11:09.6344440Z * [new branch] gh/soulitzer/313/orig -> origin/gh/soulitzer/313/orig 2025-12-04T11:11:09.6344514Z * [new branch] gh/soulitzer/319/base -> origin/gh/soulitzer/319/base 2025-12-04T11:11:09.6344589Z * [new branch] gh/soulitzer/319/head -> origin/gh/soulitzer/319/head 2025-12-04T11:11:09.6344669Z * [new branch] gh/soulitzer/319/orig -> origin/gh/soulitzer/319/orig 2025-12-04T11:11:09.6344744Z * [new branch] gh/soulitzer/320/base -> origin/gh/soulitzer/320/base 2025-12-04T11:11:09.6344820Z * [new branch] gh/soulitzer/320/head -> origin/gh/soulitzer/320/head 2025-12-04T11:11:09.6344901Z * [new branch] gh/soulitzer/320/orig -> origin/gh/soulitzer/320/orig 2025-12-04T11:11:09.6344976Z * [new branch] gh/soulitzer/336/base -> origin/gh/soulitzer/336/base 2025-12-04T11:11:09.6345053Z * [new branch] gh/soulitzer/336/head -> origin/gh/soulitzer/336/head 2025-12-04T11:11:09.6345133Z * [new branch] gh/soulitzer/336/orig -> origin/gh/soulitzer/336/orig 2025-12-04T11:11:09.6345204Z * [new branch] gh/soulitzer/347/base -> origin/gh/soulitzer/347/base 2025-12-04T11:11:09.6345284Z * [new branch] gh/soulitzer/347/head -> origin/gh/soulitzer/347/head 2025-12-04T11:11:09.6345360Z * [new branch] gh/soulitzer/347/orig -> origin/gh/soulitzer/347/orig 2025-12-04T11:11:09.6345436Z * [new branch] gh/soulitzer/349/base -> origin/gh/soulitzer/349/base 2025-12-04T11:11:09.6345517Z * [new branch] gh/soulitzer/349/head -> origin/gh/soulitzer/349/head 2025-12-04T11:11:09.6345593Z * [new branch] gh/soulitzer/349/orig -> origin/gh/soulitzer/349/orig 2025-12-04T11:11:09.6345668Z * [new branch] gh/soulitzer/350/base -> origin/gh/soulitzer/350/base 2025-12-04T11:11:09.6345749Z * [new branch] gh/soulitzer/350/head -> origin/gh/soulitzer/350/head 2025-12-04T11:11:09.6345824Z * [new branch] gh/soulitzer/350/orig -> origin/gh/soulitzer/350/orig 2025-12-04T11:11:09.6345898Z * [new branch] gh/soulitzer/351/base -> origin/gh/soulitzer/351/base 2025-12-04T11:11:09.6345978Z * [new branch] gh/soulitzer/351/head -> origin/gh/soulitzer/351/head 2025-12-04T11:11:09.6346053Z * [new branch] gh/soulitzer/351/orig -> origin/gh/soulitzer/351/orig 2025-12-04T11:11:09.6346128Z * [new branch] gh/soulitzer/353/base -> origin/gh/soulitzer/353/base 
2025-12-04T11:11:09.6346236Z * [new branch] gh/soulitzer/353/head -> origin/gh/soulitzer/353/head 2025-12-04T11:11:09.6346312Z * [new branch] gh/soulitzer/353/orig -> origin/gh/soulitzer/353/orig 2025-12-04T11:11:09.6346390Z * [new branch] gh/soulitzer/358/base -> origin/gh/soulitzer/358/base 2025-12-04T11:11:09.6346489Z * [new branch] gh/soulitzer/358/head -> origin/gh/soulitzer/358/head 2025-12-04T11:11:09.6346564Z * [new branch] gh/soulitzer/358/orig -> origin/gh/soulitzer/358/orig 2025-12-04T11:11:09.6346639Z * [new branch] gh/soulitzer/359/base -> origin/gh/soulitzer/359/base 2025-12-04T11:11:09.6346718Z * [new branch] gh/soulitzer/359/head -> origin/gh/soulitzer/359/head 2025-12-04T11:11:09.6346794Z * [new branch] gh/soulitzer/359/orig -> origin/gh/soulitzer/359/orig 2025-12-04T11:11:09.6346869Z * [new branch] gh/soulitzer/374/base -> origin/gh/soulitzer/374/base 2025-12-04T11:11:09.6346949Z * [new branch] gh/soulitzer/374/head -> origin/gh/soulitzer/374/head 2025-12-04T11:11:09.6347024Z * [new branch] gh/soulitzer/374/orig -> origin/gh/soulitzer/374/orig 2025-12-04T11:11:09.6347104Z * [new branch] gh/soulitzer/375/base -> origin/gh/soulitzer/375/base 2025-12-04T11:11:09.6347180Z * [new branch] gh/soulitzer/375/head -> origin/gh/soulitzer/375/head 2025-12-04T11:11:09.6347255Z * [new branch] gh/soulitzer/375/orig -> origin/gh/soulitzer/375/orig 2025-12-04T11:11:09.6347335Z * [new branch] gh/soulitzer/380/base -> origin/gh/soulitzer/380/base 2025-12-04T11:11:09.6347410Z * [new branch] gh/soulitzer/380/head -> origin/gh/soulitzer/380/head 2025-12-04T11:11:09.6347486Z * [new branch] gh/soulitzer/380/orig -> origin/gh/soulitzer/380/orig 2025-12-04T11:11:09.6347565Z * [new branch] gh/soulitzer/385/base -> origin/gh/soulitzer/385/base 2025-12-04T11:11:09.6347642Z * [new branch] gh/soulitzer/385/head -> origin/gh/soulitzer/385/head 2025-12-04T11:11:09.6347717Z * [new branch] gh/soulitzer/385/orig -> origin/gh/soulitzer/385/orig 2025-12-04T11:11:09.6347798Z * [new branch] gh/soulitzer/386/base -> origin/gh/soulitzer/386/base 2025-12-04T11:11:09.6347872Z * [new branch] gh/soulitzer/386/head -> origin/gh/soulitzer/386/head 2025-12-04T11:11:09.6347946Z * [new branch] gh/soulitzer/386/orig -> origin/gh/soulitzer/386/orig 2025-12-04T11:11:09.6348025Z * [new branch] gh/soulitzer/387/base -> origin/gh/soulitzer/387/base 2025-12-04T11:11:09.6348100Z * [new branch] gh/soulitzer/387/head -> origin/gh/soulitzer/387/head 2025-12-04T11:11:09.6348213Z * [new branch] gh/soulitzer/387/orig -> origin/gh/soulitzer/387/orig 2025-12-04T11:11:09.6348293Z * [new branch] gh/soulitzer/388/base -> origin/gh/soulitzer/388/base 2025-12-04T11:11:09.6348369Z * [new branch] gh/soulitzer/388/head -> origin/gh/soulitzer/388/head 2025-12-04T11:11:09.6348445Z * [new branch] gh/soulitzer/388/orig -> origin/gh/soulitzer/388/orig 2025-12-04T11:11:09.6348525Z * [new branch] gh/soulitzer/389/base -> origin/gh/soulitzer/389/base 2025-12-04T11:11:09.6348600Z * [new branch] gh/soulitzer/389/head -> origin/gh/soulitzer/389/head 2025-12-04T11:11:09.6348677Z * [new branch] gh/soulitzer/389/orig -> origin/gh/soulitzer/389/orig 2025-12-04T11:11:09.6348752Z * [new branch] gh/soulitzer/390/base -> origin/gh/soulitzer/390/base 2025-12-04T11:11:09.6348826Z * [new branch] gh/soulitzer/390/head -> origin/gh/soulitzer/390/head 2025-12-04T11:11:09.6348906Z * [new branch] gh/soulitzer/390/orig -> origin/gh/soulitzer/390/orig 2025-12-04T11:11:09.6348981Z * [new branch] gh/soulitzer/391/base -> origin/gh/soulitzer/391/base 
2025-12-04T11:11:09.6349096Z * [new branch] gh/soulitzer/391/head -> origin/gh/soulitzer/391/head 2025-12-04T11:11:09.6349176Z * [new branch] gh/soulitzer/391/orig -> origin/gh/soulitzer/391/orig 2025-12-04T11:11:09.6349281Z * [new branch] gh/soulitzer/392/base -> origin/gh/soulitzer/392/base 2025-12-04T11:11:09.6349356Z * [new branch] gh/soulitzer/392/head -> origin/gh/soulitzer/392/head 2025-12-04T11:11:09.6349433Z * [new branch] gh/soulitzer/392/orig -> origin/gh/soulitzer/392/orig 2025-12-04T11:11:09.6349508Z * [new branch] gh/swolchok/728/next -> origin/gh/swolchok/728/next 2025-12-04T11:11:09.6349583Z * [new branch] gh/swolchok/819/base -> origin/gh/swolchok/819/base 2025-12-04T11:11:09.6349660Z * [new branch] gh/swolchok/819/head -> origin/gh/swolchok/819/head 2025-12-04T11:11:09.6349736Z * [new branch] gh/swolchok/819/orig -> origin/gh/swolchok/819/orig 2025-12-04T11:11:09.6349809Z * [new branch] gh/swolchok/824/base -> origin/gh/swolchok/824/base 2025-12-04T11:11:09.6349887Z * [new branch] gh/swolchok/824/head -> origin/gh/swolchok/824/head 2025-12-04T11:11:09.6349962Z * [new branch] gh/swolchok/824/orig -> origin/gh/swolchok/824/orig 2025-12-04T11:11:09.6350035Z * [new branch] gh/swolchok/829/base -> origin/gh/swolchok/829/base 2025-12-04T11:11:09.6350113Z * [new branch] gh/swolchok/829/head -> origin/gh/swolchok/829/head 2025-12-04T11:11:09.6350187Z * [new branch] gh/swolchok/829/orig -> origin/gh/swolchok/829/orig 2025-12-04T11:11:09.6350266Z * [new branch] gh/swolchok/839/base -> origin/gh/swolchok/839/base 2025-12-04T11:11:09.6350338Z * [new branch] gh/swolchok/839/head -> origin/gh/swolchok/839/head 2025-12-04T11:11:09.6350413Z * [new branch] gh/swolchok/839/orig -> origin/gh/swolchok/839/orig 2025-12-04T11:11:09.6350491Z * [new branch] gh/swolchok/841/base -> origin/gh/swolchok/841/base 2025-12-04T11:11:09.6350564Z * [new branch] gh/swolchok/841/head -> origin/gh/swolchok/841/head 2025-12-04T11:11:09.6350641Z * [new branch] gh/swolchok/841/orig -> origin/gh/swolchok/841/orig 2025-12-04T11:11:09.6350719Z * [new branch] gh/swolchok/842/base -> origin/gh/swolchok/842/base 2025-12-04T11:11:09.6350792Z * [new branch] gh/swolchok/842/head -> origin/gh/swolchok/842/head 2025-12-04T11:11:09.6350865Z * [new branch] gh/swolchok/842/orig -> origin/gh/swolchok/842/orig 2025-12-04T11:11:09.6350941Z * [new branch] gh/swolchok/845/base -> origin/gh/swolchok/845/base 2025-12-04T11:11:09.6351014Z * [new branch] gh/swolchok/845/head -> origin/gh/swolchok/845/head 2025-12-04T11:11:09.6351089Z * [new branch] gh/swolchok/845/orig -> origin/gh/swolchok/845/orig 2025-12-04T11:11:09.6351165Z * [new branch] gh/swolchok/848/base -> origin/gh/swolchok/848/base 2025-12-04T11:11:09.6351239Z * [new branch] gh/swolchok/848/head -> origin/gh/swolchok/848/head 2025-12-04T11:11:09.6351314Z * [new branch] gh/swolchok/848/orig -> origin/gh/swolchok/848/orig 2025-12-04T11:11:09.6351392Z * [new branch] gh/swolchok/856/base -> origin/gh/swolchok/856/base 2025-12-04T11:11:09.6351465Z * [new branch] gh/swolchok/856/head -> origin/gh/swolchok/856/head 2025-12-04T11:11:09.6351538Z * [new branch] gh/swolchok/856/orig -> origin/gh/swolchok/856/orig 2025-12-04T11:11:09.6351615Z * [new branch] gh/swolchok/860/base -> origin/gh/swolchok/860/base 2025-12-04T11:11:09.6351687Z * [new branch] gh/swolchok/860/head -> origin/gh/swolchok/860/head 2025-12-04T11:11:09.6351786Z * [new branch] gh/swolchok/860/orig -> origin/gh/swolchok/860/orig 2025-12-04T11:11:09.6351860Z * [new branch] gh/swolchok/861/base -> 
origin/gh/swolchok/861/base 2025-12-04T11:11:09.6351932Z * [new branch] gh/swolchok/861/head -> origin/gh/swolchok/861/head 2025-12-04T11:11:09.6352031Z * [new branch] gh/swolchok/861/orig -> origin/gh/swolchok/861/orig 2025-12-04T11:11:09.6352104Z * [new branch] gh/swolchok/862/base -> origin/gh/swolchok/862/base 2025-12-04T11:11:09.6352177Z * [new branch] gh/swolchok/862/head -> origin/gh/swolchok/862/head 2025-12-04T11:11:09.6352251Z * [new branch] gh/swolchok/862/orig -> origin/gh/swolchok/862/orig 2025-12-04T11:11:09.6352323Z * [new branch] gh/swolchok/863/base -> origin/gh/swolchok/863/base 2025-12-04T11:11:09.6352394Z * [new branch] gh/swolchok/863/head -> origin/gh/swolchok/863/head 2025-12-04T11:11:09.6352469Z * [new branch] gh/swolchok/863/orig -> origin/gh/swolchok/863/orig 2025-12-04T11:11:09.6352541Z * [new branch] gh/swolchok/864/base -> origin/gh/swolchok/864/base 2025-12-04T11:11:09.6352614Z * [new branch] gh/swolchok/864/head -> origin/gh/swolchok/864/head 2025-12-04T11:11:09.6352689Z * [new branch] gh/swolchok/864/orig -> origin/gh/swolchok/864/orig 2025-12-04T11:11:09.6352760Z * [new branch] gh/swolchok/865/base -> origin/gh/swolchok/865/base 2025-12-04T11:11:09.6352832Z * [new branch] gh/swolchok/865/head -> origin/gh/swolchok/865/head 2025-12-04T11:11:09.6352908Z * [new branch] gh/swolchok/865/orig -> origin/gh/swolchok/865/orig 2025-12-04T11:11:09.6352980Z * [new branch] gh/swolchok/866/base -> origin/gh/swolchok/866/base 2025-12-04T11:11:09.6353052Z * [new branch] gh/swolchok/866/head -> origin/gh/swolchok/866/head 2025-12-04T11:11:09.6353127Z * [new branch] gh/swolchok/866/orig -> origin/gh/swolchok/866/orig 2025-12-04T11:11:09.6353201Z * [new branch] gh/swolchok/867/base -> origin/gh/swolchok/867/base 2025-12-04T11:11:09.6353276Z * [new branch] gh/swolchok/867/head -> origin/gh/swolchok/867/head 2025-12-04T11:11:09.6353350Z * [new branch] gh/swolchok/867/orig -> origin/gh/swolchok/867/orig 2025-12-04T11:11:09.6353423Z * [new branch] gh/swolchok/868/base -> origin/gh/swolchok/868/base 2025-12-04T11:11:09.6353497Z * [new branch] gh/swolchok/868/head -> origin/gh/swolchok/868/head 2025-12-04T11:11:09.6353569Z * [new branch] gh/swolchok/868/orig -> origin/gh/swolchok/868/orig 2025-12-04T11:11:09.6353642Z * [new branch] gh/swolchok/869/base -> origin/gh/swolchok/869/base 2025-12-04T11:11:09.6353716Z * [new branch] gh/swolchok/869/head -> origin/gh/swolchok/869/head 2025-12-04T11:11:09.6353788Z * [new branch] gh/swolchok/869/orig -> origin/gh/swolchok/869/orig 2025-12-04T11:11:09.6353859Z * [new branch] gh/swolchok/870/base -> origin/gh/swolchok/870/base 2025-12-04T11:11:09.6353936Z * [new branch] gh/swolchok/870/head -> origin/gh/swolchok/870/head 2025-12-04T11:11:09.6354007Z * [new branch] gh/swolchok/870/orig -> origin/gh/swolchok/870/orig 2025-12-04T11:11:09.6354079Z * [new branch] gh/swolchok/871/base -> origin/gh/swolchok/871/base 2025-12-04T11:11:09.6354154Z * [new branch] gh/swolchok/871/head -> origin/gh/swolchok/871/head 2025-12-04T11:11:09.6354226Z * [new branch] gh/swolchok/871/orig -> origin/gh/swolchok/871/orig 2025-12-04T11:11:09.6354300Z * [new branch] gh/teja-rao/4/base -> origin/gh/teja-rao/4/base 2025-12-04T11:11:09.6354380Z * [new branch] gh/teja-rao/4/head -> origin/gh/teja-rao/4/head 2025-12-04T11:11:09.6354473Z * [new branch] gh/teja-rao/4/orig -> origin/gh/teja-rao/4/orig 2025-12-04T11:11:09.6354547Z * [new branch] gh/tianyu-l/2/base -> origin/gh/tianyu-l/2/base 2025-12-04T11:11:09.6354645Z * [new branch] gh/tianyu-l/2/head -> 
origin/gh/tianyu-l/2/head 2025-12-04T11:11:09.6354715Z * [new branch] gh/tianyu-l/2/orig -> origin/gh/tianyu-l/2/orig 2025-12-04T11:11:09.6354784Z * [new branch] gh/tianyu-l/3/base -> origin/gh/tianyu-l/3/base 2025-12-04T11:11:09.6354856Z * [new branch] gh/tianyu-l/3/orig -> origin/gh/tianyu-l/3/orig 2025-12-04T11:11:09.6354925Z * [new branch] gh/tianyu-l/4/base -> origin/gh/tianyu-l/4/base 2025-12-04T11:11:09.6354996Z * [new branch] gh/tianyu-l/4/head -> origin/gh/tianyu-l/4/head 2025-12-04T11:11:09.6355065Z * [new branch] gh/tianyu-l/4/orig -> origin/gh/tianyu-l/4/orig 2025-12-04T11:11:09.6355158Z * [new branch] gh/tugsbayasgalan/10/base -> origin/gh/tugsbayasgalan/10/base 2025-12-04T11:11:09.6355250Z * [new branch] gh/tugsbayasgalan/10/head -> origin/gh/tugsbayasgalan/10/head 2025-12-04T11:11:09.6355339Z * [new branch] gh/tugsbayasgalan/10/orig -> origin/gh/tugsbayasgalan/10/orig 2025-12-04T11:11:09.6355424Z * [new branch] gh/tugsbayasgalan/13/base -> origin/gh/tugsbayasgalan/13/base 2025-12-04T11:11:09.6355513Z * [new branch] gh/tugsbayasgalan/13/head -> origin/gh/tugsbayasgalan/13/head 2025-12-04T11:11:09.6355598Z * [new branch] gh/tugsbayasgalan/13/orig -> origin/gh/tugsbayasgalan/13/orig 2025-12-04T11:11:09.6355684Z * [new branch] gh/tugsbayasgalan/17/base -> origin/gh/tugsbayasgalan/17/base 2025-12-04T11:11:09.6355772Z * [new branch] gh/tugsbayasgalan/17/head -> origin/gh/tugsbayasgalan/17/head 2025-12-04T11:11:09.6355858Z * [new branch] gh/tugsbayasgalan/17/orig -> origin/gh/tugsbayasgalan/17/orig 2025-12-04T11:11:09.6355945Z * [new branch] gh/tugsbayasgalan/2/base -> origin/gh/tugsbayasgalan/2/base 2025-12-04T11:11:09.6356031Z * [new branch] gh/tugsbayasgalan/2/head -> origin/gh/tugsbayasgalan/2/head 2025-12-04T11:11:09.6356117Z * [new branch] gh/tugsbayasgalan/2/orig -> origin/gh/tugsbayasgalan/2/orig 2025-12-04T11:11:09.6356203Z * [new branch] gh/tugsbayasgalan/28/base -> origin/gh/tugsbayasgalan/28/base 2025-12-04T11:11:09.6356292Z * [new branch] gh/tugsbayasgalan/28/head -> origin/gh/tugsbayasgalan/28/head 2025-12-04T11:11:09.6356377Z * [new branch] gh/tugsbayasgalan/28/orig -> origin/gh/tugsbayasgalan/28/orig 2025-12-04T11:11:09.6356462Z * [new branch] gh/tugsbayasgalan/32/base -> origin/gh/tugsbayasgalan/32/base 2025-12-04T11:11:09.6356550Z * [new branch] gh/tugsbayasgalan/32/head -> origin/gh/tugsbayasgalan/32/head 2025-12-04T11:11:09.6356637Z * [new branch] gh/tugsbayasgalan/32/orig -> origin/gh/tugsbayasgalan/32/orig 2025-12-04T11:11:09.6356726Z * [new branch] gh/tugsbayasgalan/35/base -> origin/gh/tugsbayasgalan/35/base 2025-12-04T11:11:09.6356816Z * [new branch] gh/tugsbayasgalan/35/head -> origin/gh/tugsbayasgalan/35/head 2025-12-04T11:11:09.6356903Z * [new branch] gh/tugsbayasgalan/35/orig -> origin/gh/tugsbayasgalan/35/orig 2025-12-04T11:11:09.6356993Z * [new branch] gh/tugsbayasgalan/36/base -> origin/gh/tugsbayasgalan/36/base 2025-12-04T11:11:09.6357077Z * [new branch] gh/tugsbayasgalan/36/head -> origin/gh/tugsbayasgalan/36/head 2025-12-04T11:11:09.6357162Z * [new branch] gh/tugsbayasgalan/36/orig -> origin/gh/tugsbayasgalan/36/orig 2025-12-04T11:11:09.6357250Z * [new branch] gh/tugsbayasgalan/37/base -> origin/gh/tugsbayasgalan/37/base 2025-12-04T11:11:09.6357358Z * [new branch] gh/tugsbayasgalan/37/head -> origin/gh/tugsbayasgalan/37/head 2025-12-04T11:11:09.6357444Z * [new branch] gh/tugsbayasgalan/37/orig -> origin/gh/tugsbayasgalan/37/orig 2025-12-04T11:11:09.6357554Z * [new branch] gh/tugsbayasgalan/43/base -> origin/gh/tugsbayasgalan/43/base 
2025-12-04T11:11:09.6357639Z * [new branch] gh/tugsbayasgalan/43/head -> origin/gh/tugsbayasgalan/43/head 2025-12-04T11:11:09.6357723Z * [new branch] gh/tugsbayasgalan/43/orig -> origin/gh/tugsbayasgalan/43/orig 2025-12-04T11:11:09.6357812Z * [new branch] gh/tugsbayasgalan/48/base -> origin/gh/tugsbayasgalan/48/base 2025-12-04T11:11:09.6357896Z * [new branch] gh/tugsbayasgalan/48/head -> origin/gh/tugsbayasgalan/48/head 2025-12-04T11:11:09.6357981Z * [new branch] gh/tugsbayasgalan/48/orig -> origin/gh/tugsbayasgalan/48/orig 2025-12-04T11:11:09.6358074Z * [new branch] gh/tugsbayasgalan/51/base -> origin/gh/tugsbayasgalan/51/base 2025-12-04T11:11:09.6358200Z * [new branch] gh/tugsbayasgalan/51/head -> origin/gh/tugsbayasgalan/51/head 2025-12-04T11:11:09.6358285Z * [new branch] gh/tugsbayasgalan/51/orig -> origin/gh/tugsbayasgalan/51/orig 2025-12-04T11:11:09.6358375Z * [new branch] gh/tugsbayasgalan/52/base -> origin/gh/tugsbayasgalan/52/base 2025-12-04T11:11:09.6358461Z * [new branch] gh/tugsbayasgalan/52/head -> origin/gh/tugsbayasgalan/52/head 2025-12-04T11:11:09.6358551Z * [new branch] gh/tugsbayasgalan/52/orig -> origin/gh/tugsbayasgalan/52/orig 2025-12-04T11:11:09.6358636Z * [new branch] gh/tugsbayasgalan/53/base -> origin/gh/tugsbayasgalan/53/base 2025-12-04T11:11:09.6358721Z * [new branch] gh/tugsbayasgalan/53/head -> origin/gh/tugsbayasgalan/53/head 2025-12-04T11:11:09.6358809Z * [new branch] gh/tugsbayasgalan/53/orig -> origin/gh/tugsbayasgalan/53/orig 2025-12-04T11:11:09.6358895Z * [new branch] gh/tugsbayasgalan/55/base -> origin/gh/tugsbayasgalan/55/base 2025-12-04T11:11:09.6358981Z * [new branch] gh/tugsbayasgalan/55/head -> origin/gh/tugsbayasgalan/55/head 2025-12-04T11:11:09.6359070Z * [new branch] gh/tugsbayasgalan/55/orig -> origin/gh/tugsbayasgalan/55/orig 2025-12-04T11:11:09.6359155Z * [new branch] gh/tugsbayasgalan/59/base -> origin/gh/tugsbayasgalan/59/base 2025-12-04T11:11:09.6359240Z * [new branch] gh/tugsbayasgalan/59/head -> origin/gh/tugsbayasgalan/59/head 2025-12-04T11:11:09.6359325Z * [new branch] gh/tugsbayasgalan/59/orig -> origin/gh/tugsbayasgalan/59/orig 2025-12-04T11:11:09.6359409Z * [new branch] gh/tugsbayasgalan/6/base -> origin/gh/tugsbayasgalan/6/base 2025-12-04T11:11:09.6359492Z * [new branch] gh/tugsbayasgalan/6/head -> origin/gh/tugsbayasgalan/6/head 2025-12-04T11:11:09.6359579Z * [new branch] gh/tugsbayasgalan/6/orig -> origin/gh/tugsbayasgalan/6/orig 2025-12-04T11:11:09.6359664Z * [new branch] gh/tugsbayasgalan/60/base -> origin/gh/tugsbayasgalan/60/base 2025-12-04T11:11:09.6359748Z * [new branch] gh/tugsbayasgalan/60/head -> origin/gh/tugsbayasgalan/60/head 2025-12-04T11:11:09.6359840Z * [new branch] gh/tugsbayasgalan/60/orig -> origin/gh/tugsbayasgalan/60/orig 2025-12-04T11:11:09.6359925Z * [new branch] gh/tugsbayasgalan/61/base -> origin/gh/tugsbayasgalan/61/base 2025-12-04T11:11:09.6360014Z * [new branch] gh/tugsbayasgalan/61/head -> origin/gh/tugsbayasgalan/61/head 2025-12-04T11:11:09.6360099Z * [new branch] gh/tugsbayasgalan/61/orig -> origin/gh/tugsbayasgalan/61/orig 2025-12-04T11:11:09.6360183Z * [new branch] gh/tugsbayasgalan/63/base -> origin/gh/tugsbayasgalan/63/base 2025-12-04T11:11:09.6360271Z * [new branch] gh/tugsbayasgalan/63/head -> origin/gh/tugsbayasgalan/63/head 2025-12-04T11:11:09.6360390Z * [new branch] gh/tugsbayasgalan/63/orig -> origin/gh/tugsbayasgalan/63/orig 2025-12-04T11:11:09.6360476Z * [new branch] gh/tugsbayasgalan/67/base -> origin/gh/tugsbayasgalan/67/base 2025-12-04T11:11:09.6360591Z * [new branch] 
gh/tugsbayasgalan/67/head -> origin/gh/tugsbayasgalan/67/head 2025-12-04T11:11:09.6360676Z * [new branch] gh/tugsbayasgalan/67/orig -> origin/gh/tugsbayasgalan/67/orig 2025-12-04T11:11:09.6360761Z * [new branch] gh/tugsbayasgalan/68/base -> origin/gh/tugsbayasgalan/68/base 2025-12-04T11:11:09.6360849Z * [new branch] gh/tugsbayasgalan/68/head -> origin/gh/tugsbayasgalan/68/head 2025-12-04T11:11:09.6360934Z * [new branch] gh/tugsbayasgalan/68/orig -> origin/gh/tugsbayasgalan/68/orig 2025-12-04T11:11:09.6361018Z * [new branch] gh/tugsbayasgalan/7/base -> origin/gh/tugsbayasgalan/7/base 2025-12-04T11:11:09.6361108Z * [new branch] gh/tugsbayasgalan/7/head -> origin/gh/tugsbayasgalan/7/head 2025-12-04T11:11:09.6361191Z * [new branch] gh/tugsbayasgalan/7/orig -> origin/gh/tugsbayasgalan/7/orig 2025-12-04T11:11:09.6361277Z * [new branch] gh/tugsbayasgalan/70/base -> origin/gh/tugsbayasgalan/70/base 2025-12-04T11:11:09.6361368Z * [new branch] gh/tugsbayasgalan/70/head -> origin/gh/tugsbayasgalan/70/head 2025-12-04T11:11:09.6361455Z * [new branch] gh/tugsbayasgalan/70/orig -> origin/gh/tugsbayasgalan/70/orig 2025-12-04T11:11:09.6361541Z * [new branch] gh/tugsbayasgalan/71/base -> origin/gh/tugsbayasgalan/71/base 2025-12-04T11:11:09.6361633Z * [new branch] gh/tugsbayasgalan/71/head -> origin/gh/tugsbayasgalan/71/head 2025-12-04T11:11:09.6361718Z * [new branch] gh/tugsbayasgalan/71/orig -> origin/gh/tugsbayasgalan/71/orig 2025-12-04T11:11:09.6361806Z * [new branch] gh/tugsbayasgalan/72/base -> origin/gh/tugsbayasgalan/72/base 2025-12-04T11:11:09.6361892Z * [new branch] gh/tugsbayasgalan/72/head -> origin/gh/tugsbayasgalan/72/head 2025-12-04T11:11:09.6361977Z * [new branch] gh/tugsbayasgalan/72/orig -> origin/gh/tugsbayasgalan/72/orig 2025-12-04T11:11:09.6362069Z * [new branch] gh/tugsbayasgalan/73/base -> origin/gh/tugsbayasgalan/73/base 2025-12-04T11:11:09.6362153Z * [new branch] gh/tugsbayasgalan/73/head -> origin/gh/tugsbayasgalan/73/head 2025-12-04T11:11:09.6362238Z * [new branch] gh/tugsbayasgalan/73/orig -> origin/gh/tugsbayasgalan/73/orig 2025-12-04T11:11:09.6362327Z * [new branch] gh/tugsbayasgalan/74/base -> origin/gh/tugsbayasgalan/74/base 2025-12-04T11:11:09.6362411Z * [new branch] gh/tugsbayasgalan/74/head -> origin/gh/tugsbayasgalan/74/head 2025-12-04T11:11:09.6362495Z * [new branch] gh/tugsbayasgalan/74/orig -> origin/gh/tugsbayasgalan/74/orig 2025-12-04T11:11:09.6362584Z * [new branch] gh/tugsbayasgalan/75/base -> origin/gh/tugsbayasgalan/75/base 2025-12-04T11:11:09.6362668Z * [new branch] gh/tugsbayasgalan/75/head -> origin/gh/tugsbayasgalan/75/head 2025-12-04T11:11:09.6362754Z * [new branch] gh/tugsbayasgalan/75/orig -> origin/gh/tugsbayasgalan/75/orig 2025-12-04T11:11:09.6362843Z * [new branch] gh/tugsbayasgalan/76/base -> origin/gh/tugsbayasgalan/76/base 2025-12-04T11:11:09.6362930Z * [new branch] gh/tugsbayasgalan/76/head -> origin/gh/tugsbayasgalan/76/head 2025-12-04T11:11:09.6363015Z * [new branch] gh/tugsbayasgalan/76/orig -> origin/gh/tugsbayasgalan/76/orig 2025-12-04T11:11:09.6363104Z * [new branch] gh/tugsbayasgalan/77/base -> origin/gh/tugsbayasgalan/77/base 2025-12-04T11:11:09.6363189Z * [new branch] gh/tugsbayasgalan/77/head -> origin/gh/tugsbayasgalan/77/head 2025-12-04T11:11:09.6363301Z * [new branch] gh/tugsbayasgalan/77/orig -> origin/gh/tugsbayasgalan/77/orig 2025-12-04T11:11:09.6363386Z * [new branch] gh/tugsbayasgalan/78/base -> origin/gh/tugsbayasgalan/78/base 2025-12-04T11:11:09.6363471Z * [new branch] gh/tugsbayasgalan/78/head -> origin/gh/tugsbayasgalan/78/head 
2025-12-04T11:11:09.6363584Z * [new branch] gh/tugsbayasgalan/78/orig -> origin/gh/tugsbayasgalan/78/orig 2025-12-04T11:11:09.6363669Z * [new branch] gh/tugsbayasgalan/79/base -> origin/gh/tugsbayasgalan/79/base 2025-12-04T11:11:09.6363755Z * [new branch] gh/tugsbayasgalan/79/head -> origin/gh/tugsbayasgalan/79/head 2025-12-04T11:11:09.6363844Z * [new branch] gh/tugsbayasgalan/79/orig -> origin/gh/tugsbayasgalan/79/orig 2025-12-04T11:11:09.6363927Z * [new branch] gh/tugsbayasgalan/8/base -> origin/gh/tugsbayasgalan/8/base 2025-12-04T11:11:09.6364011Z * [new branch] gh/tugsbayasgalan/8/head -> origin/gh/tugsbayasgalan/8/head 2025-12-04T11:11:09.6364101Z * [new branch] gh/tugsbayasgalan/8/orig -> origin/gh/tugsbayasgalan/8/orig 2025-12-04T11:11:09.6364186Z * [new branch] gh/tugsbayasgalan/80/base -> origin/gh/tugsbayasgalan/80/base 2025-12-04T11:11:09.6364274Z * [new branch] gh/tugsbayasgalan/80/head -> origin/gh/tugsbayasgalan/80/head 2025-12-04T11:11:09.6364365Z * [new branch] gh/tugsbayasgalan/80/orig -> origin/gh/tugsbayasgalan/80/orig 2025-12-04T11:11:09.6364450Z * [new branch] gh/tugsbayasgalan/81/base -> origin/gh/tugsbayasgalan/81/base 2025-12-04T11:11:09.6364535Z * [new branch] gh/tugsbayasgalan/81/head -> origin/gh/tugsbayasgalan/81/head 2025-12-04T11:11:09.6364624Z * [new branch] gh/tugsbayasgalan/81/orig -> origin/gh/tugsbayasgalan/81/orig 2025-12-04T11:11:09.6364708Z * [new branch] gh/tugsbayasgalan/82/base -> origin/gh/tugsbayasgalan/82/base 2025-12-04T11:11:09.6364794Z * [new branch] gh/tugsbayasgalan/82/head -> origin/gh/tugsbayasgalan/82/head 2025-12-04T11:11:09.6364883Z * [new branch] gh/tugsbayasgalan/82/orig -> origin/gh/tugsbayasgalan/82/orig 2025-12-04T11:11:09.6364968Z * [new branch] gh/tugsbayasgalan/83/base -> origin/gh/tugsbayasgalan/83/base 2025-12-04T11:11:09.6365057Z * [new branch] gh/tugsbayasgalan/83/head -> origin/gh/tugsbayasgalan/83/head 2025-12-04T11:11:09.6365142Z * [new branch] gh/tugsbayasgalan/83/orig -> origin/gh/tugsbayasgalan/83/orig 2025-12-04T11:11:09.6365226Z * [new branch] gh/tugsbayasgalan/84/base -> origin/gh/tugsbayasgalan/84/base 2025-12-04T11:11:09.6365316Z * [new branch] gh/tugsbayasgalan/84/head -> origin/gh/tugsbayasgalan/84/head 2025-12-04T11:11:09.6365400Z * [new branch] gh/tugsbayasgalan/84/orig -> origin/gh/tugsbayasgalan/84/orig 2025-12-04T11:11:09.6365485Z * [new branch] gh/tugsbayasgalan/85/base -> origin/gh/tugsbayasgalan/85/base 2025-12-04T11:11:09.6365576Z * [new branch] gh/tugsbayasgalan/85/head -> origin/gh/tugsbayasgalan/85/head 2025-12-04T11:11:09.6365660Z * [new branch] gh/tugsbayasgalan/85/orig -> origin/gh/tugsbayasgalan/85/orig 2025-12-04T11:11:09.6365745Z * [new branch] gh/tugsbayasgalan/86/base -> origin/gh/tugsbayasgalan/86/base 2025-12-04T11:11:09.6365833Z * [new branch] gh/tugsbayasgalan/86/head -> origin/gh/tugsbayasgalan/86/head 2025-12-04T11:11:09.6365917Z * [new branch] gh/tugsbayasgalan/86/orig -> origin/gh/tugsbayasgalan/86/orig 2025-12-04T11:11:09.6366001Z * [new branch] gh/tugsbayasgalan/87/base -> origin/gh/tugsbayasgalan/87/base 2025-12-04T11:11:09.6366087Z * [new branch] gh/tugsbayasgalan/87/head -> origin/gh/tugsbayasgalan/87/head 2025-12-04T11:11:09.6366172Z * [new branch] gh/tugsbayasgalan/87/orig -> origin/gh/tugsbayasgalan/87/orig 2025-12-04T11:11:09.6366276Z * [new branch] gh/tugsbayasgalan/88/base -> origin/gh/tugsbayasgalan/88/base 2025-12-04T11:11:09.6366366Z * [new branch] gh/tugsbayasgalan/88/head -> origin/gh/tugsbayasgalan/88/head 2025-12-04T11:11:09.6366473Z * [new branch] 
gh/tugsbayasgalan/88/orig -> origin/gh/tugsbayasgalan/88/orig 2025-12-04T11:11:09.6366561Z * [new branch] gh/tugsbayasgalan/89/base -> origin/gh/tugsbayasgalan/89/base 2025-12-04T11:11:09.6366646Z * [new branch] gh/tugsbayasgalan/89/head -> origin/gh/tugsbayasgalan/89/head 2025-12-04T11:11:09.6366731Z * [new branch] gh/tugsbayasgalan/89/orig -> origin/gh/tugsbayasgalan/89/orig 2025-12-04T11:11:09.6366819Z * [new branch] gh/tugsbayasgalan/9/base -> origin/gh/tugsbayasgalan/9/base 2025-12-04T11:11:09.6366902Z * [new branch] gh/tugsbayasgalan/9/head -> origin/gh/tugsbayasgalan/9/head 2025-12-04T11:11:09.6366987Z * [new branch] gh/tugsbayasgalan/9/orig -> origin/gh/tugsbayasgalan/9/orig 2025-12-04T11:11:09.6367078Z * [new branch] gh/tugsbayasgalan/90/base -> origin/gh/tugsbayasgalan/90/base 2025-12-04T11:11:09.6367164Z * [new branch] gh/tugsbayasgalan/90/head -> origin/gh/tugsbayasgalan/90/head 2025-12-04T11:11:09.6367253Z * [new branch] gh/tugsbayasgalan/90/orig -> origin/gh/tugsbayasgalan/90/orig 2025-12-04T11:11:09.6367344Z * [new branch] gh/tugsbayasgalan/91/base -> origin/gh/tugsbayasgalan/91/base 2025-12-04T11:11:09.6367429Z * [new branch] gh/tugsbayasgalan/91/head -> origin/gh/tugsbayasgalan/91/head 2025-12-04T11:11:09.6367514Z * [new branch] gh/tugsbayasgalan/91/orig -> origin/gh/tugsbayasgalan/91/orig 2025-12-04T11:11:09.6367603Z * [new branch] gh/tugsbayasgalan/92/base -> origin/gh/tugsbayasgalan/92/base 2025-12-04T11:11:09.6367687Z * [new branch] gh/tugsbayasgalan/92/head -> origin/gh/tugsbayasgalan/92/head 2025-12-04T11:11:09.6367773Z * [new branch] gh/tugsbayasgalan/92/orig -> origin/gh/tugsbayasgalan/92/orig 2025-12-04T11:11:09.6367863Z * [new branch] gh/tugsbayasgalan/93/base -> origin/gh/tugsbayasgalan/93/base 2025-12-04T11:11:09.6367948Z * [new branch] gh/tugsbayasgalan/93/head -> origin/gh/tugsbayasgalan/93/head 2025-12-04T11:11:09.6368033Z * [new branch] gh/tugsbayasgalan/93/orig -> origin/gh/tugsbayasgalan/93/orig 2025-12-04T11:11:09.6368108Z * [new branch] gh/v0i0/14/base -> origin/gh/v0i0/14/base 2025-12-04T11:11:09.6368207Z * [new branch] gh/v0i0/14/head -> origin/gh/v0i0/14/head 2025-12-04T11:11:09.6368279Z * [new branch] gh/v0i0/14/orig -> origin/gh/v0i0/14/orig 2025-12-04T11:11:09.6368346Z * [new branch] gh/v0i0/15/base -> origin/gh/v0i0/15/base 2025-12-04T11:11:09.6368410Z * [new branch] gh/v0i0/15/head -> origin/gh/v0i0/15/head 2025-12-04T11:11:09.6368481Z * [new branch] gh/v0i0/15/orig -> origin/gh/v0i0/15/orig 2025-12-04T11:11:09.6368548Z * [new branch] gh/v0i0/16/base -> origin/gh/v0i0/16/base 2025-12-04T11:11:09.6368614Z * [new branch] gh/v0i0/16/head -> origin/gh/v0i0/16/head 2025-12-04T11:11:09.6368684Z * [new branch] gh/v0i0/16/orig -> origin/gh/v0i0/16/orig 2025-12-04T11:11:09.6368751Z * [new branch] gh/v0i0/17/base -> origin/gh/v0i0/17/base 2025-12-04T11:11:09.6368817Z * [new branch] gh/v0i0/17/head -> origin/gh/v0i0/17/head 2025-12-04T11:11:09.6368884Z * [new branch] gh/v0i0/17/orig -> origin/gh/v0i0/17/orig 2025-12-04T11:11:09.6368949Z * [new branch] gh/v0i0/18/base -> origin/gh/v0i0/18/base 2025-12-04T11:11:09.6369013Z * [new branch] gh/v0i0/18/head -> origin/gh/v0i0/18/head 2025-12-04T11:11:09.6369106Z * [new branch] gh/v0i0/18/orig -> origin/gh/v0i0/18/orig 2025-12-04T11:11:09.6369172Z * [new branch] gh/v0i0/19/base -> origin/gh/v0i0/19/base 2025-12-04T11:11:09.6369238Z * [new branch] gh/v0i0/19/head -> origin/gh/v0i0/19/head 2025-12-04T11:11:09.6369342Z * [new branch] gh/v0i0/19/orig -> origin/gh/v0i0/19/orig 2025-12-04T11:11:09.6369426Z * [new 
branch] gh/vishal9-team/1/base -> origin/gh/vishal9-team/1/base 2025-12-04T11:11:09.6369504Z * [new branch] gh/vishal9-team/1/head -> origin/gh/vishal9-team/1/head 2025-12-04T11:11:09.6369583Z * [new branch] gh/vishal9-team/2/base -> origin/gh/vishal9-team/2/base 2025-12-04T11:11:09.6369659Z * [new branch] gh/vishal9-team/2/head -> origin/gh/vishal9-team/2/head 2025-12-04T11:11:09.6369735Z * [new branch] gh/vishal9-team/2/orig -> origin/gh/vishal9-team/2/orig 2025-12-04T11:11:09.6369814Z * [new branch] gh/vishal9-team/3/base -> origin/gh/vishal9-team/3/base 2025-12-04T11:11:09.6369890Z * [new branch] gh/vishal9-team/3/head -> origin/gh/vishal9-team/3/head 2025-12-04T11:11:09.6369967Z * [new branch] gh/vishal9-team/3/orig -> origin/gh/vishal9-team/3/orig 2025-12-04T11:11:09.6370044Z * [new branch] gh/vishal9-team/4/base -> origin/gh/vishal9-team/4/base 2025-12-04T11:11:09.6370120Z * [new branch] gh/vishal9-team/4/head -> origin/gh/vishal9-team/4/head 2025-12-04T11:11:09.6370197Z * [new branch] gh/vishal9-team/4/orig -> origin/gh/vishal9-team/4/orig 2025-12-04T11:11:09.6370264Z * [new branch] gh/vkuzo/1/next -> origin/gh/vkuzo/1/next 2025-12-04T11:11:09.6370332Z * [new branch] gh/vkuzo/2/next -> origin/gh/vkuzo/2/next 2025-12-04T11:11:09.6370404Z * [new branch] gh/vkuzo/3/next -> origin/gh/vkuzo/3/next 2025-12-04T11:11:09.6370483Z * [new branch] gh/wconstab/424/base -> origin/gh/wconstab/424/base 2025-12-04T11:11:09.6370561Z * [new branch] gh/wconstab/424/head -> origin/gh/wconstab/424/head 2025-12-04T11:11:09.6370638Z * [new branch] gh/wconstab/424/orig -> origin/gh/wconstab/424/orig 2025-12-04T11:11:09.6370710Z * [new branch] gh/wconstab/435/base -> origin/gh/wconstab/435/base 2025-12-04T11:11:09.6370783Z * [new branch] gh/wconstab/435/head -> origin/gh/wconstab/435/head 2025-12-04T11:11:09.6370856Z * [new branch] gh/wconstab/435/orig -> origin/gh/wconstab/435/orig 2025-12-04T11:11:09.6370927Z * [new branch] gh/wconstab/444/base -> origin/gh/wconstab/444/base 2025-12-04T11:11:09.6370999Z * [new branch] gh/wconstab/444/head -> origin/gh/wconstab/444/head 2025-12-04T11:11:09.6371074Z * [new branch] gh/wconstab/444/orig -> origin/gh/wconstab/444/orig 2025-12-04T11:11:09.6371147Z * [new branch] gh/wconstab/447/base -> origin/gh/wconstab/447/base 2025-12-04T11:11:09.6371219Z * [new branch] gh/wconstab/447/head -> origin/gh/wconstab/447/head 2025-12-04T11:11:09.6371296Z * [new branch] gh/wconstab/447/orig -> origin/gh/wconstab/447/orig 2025-12-04T11:11:09.6371367Z * [new branch] gh/wconstab/448/base -> origin/gh/wconstab/448/base 2025-12-04T11:11:09.6371438Z * [new branch] gh/wconstab/448/head -> origin/gh/wconstab/448/head 2025-12-04T11:11:09.6371514Z * [new branch] gh/wconstab/448/orig -> origin/gh/wconstab/448/orig 2025-12-04T11:11:09.6371587Z * [new branch] gh/wconstab/449/base -> origin/gh/wconstab/449/base 2025-12-04T11:11:09.6371662Z * [new branch] gh/wconstab/449/head -> origin/gh/wconstab/449/head 2025-12-04T11:11:09.6371734Z * [new branch] gh/wconstab/449/orig -> origin/gh/wconstab/449/orig 2025-12-04T11:11:09.6371829Z * [new branch] gh/wconstab/450/base -> origin/gh/wconstab/450/base 2025-12-04T11:11:09.6371903Z * [new branch] gh/wconstab/450/head -> origin/gh/wconstab/450/head 2025-12-04T11:11:09.6371997Z * [new branch] gh/wconstab/450/orig -> origin/gh/wconstab/450/orig 2025-12-04T11:11:09.6372068Z * [new branch] gh/wconstab/451/base -> origin/gh/wconstab/451/base 2025-12-04T11:11:09.6372144Z * [new branch] gh/wconstab/451/head -> origin/gh/wconstab/451/head 
2025-12-04T11:11:09.6372217Z * [new branch] gh/wconstab/451/orig -> origin/gh/wconstab/451/orig 2025-12-04T11:11:09.6372289Z * [new branch] gh/wconstab/452/base -> origin/gh/wconstab/452/base 2025-12-04T11:11:09.6372363Z * [new branch] gh/wconstab/452/head -> origin/gh/wconstab/452/head 2025-12-04T11:11:09.6372435Z * [new branch] gh/wconstab/452/orig -> origin/gh/wconstab/452/orig 2025-12-04T11:11:09.6372509Z * [new branch] gh/wconstab/453/base -> origin/gh/wconstab/453/base 2025-12-04T11:11:09.6372583Z * [new branch] gh/wconstab/453/head -> origin/gh/wconstab/453/head 2025-12-04T11:11:09.6372656Z * [new branch] gh/wconstab/453/orig -> origin/gh/wconstab/453/orig 2025-12-04T11:11:09.6372729Z * [new branch] gh/wconstab/454/base -> origin/gh/wconstab/454/base 2025-12-04T11:11:09.6372807Z * [new branch] gh/wconstab/454/head -> origin/gh/wconstab/454/head 2025-12-04T11:11:09.6372880Z * [new branch] gh/wconstab/454/orig -> origin/gh/wconstab/454/orig 2025-12-04T11:11:09.6372952Z * [new branch] gh/wconstab/455/base -> origin/gh/wconstab/455/base 2025-12-04T11:11:09.6373027Z * [new branch] gh/wconstab/455/head -> origin/gh/wconstab/455/head 2025-12-04T11:11:09.6373100Z * [new branch] gh/wconstab/455/orig -> origin/gh/wconstab/455/orig 2025-12-04T11:11:09.6373172Z * [new branch] gh/wconstab/456/base -> origin/gh/wconstab/456/base 2025-12-04T11:11:09.6373246Z * [new branch] gh/wconstab/456/head -> origin/gh/wconstab/456/head 2025-12-04T11:11:09.6373320Z * [new branch] gh/wconstab/456/orig -> origin/gh/wconstab/456/orig 2025-12-04T11:11:09.6373394Z * [new branch] gh/wconstab/457/base -> origin/gh/wconstab/457/base 2025-12-04T11:11:09.6373467Z * [new branch] gh/wconstab/457/head -> origin/gh/wconstab/457/head 2025-12-04T11:11:09.6373543Z * [new branch] gh/wconstab/457/orig -> origin/gh/wconstab/457/orig 2025-12-04T11:11:09.6373636Z * [new branch] gh/wconstab/458/base -> origin/gh/wconstab/458/base 2025-12-04T11:11:09.6373713Z * [new branch] gh/wconstab/458/head -> origin/gh/wconstab/458/head 2025-12-04T11:11:09.6373787Z * [new branch] gh/wconstab/458/orig -> origin/gh/wconstab/458/orig 2025-12-04T11:11:09.6373861Z * [new branch] gh/wconstab/459/base -> origin/gh/wconstab/459/base 2025-12-04T11:11:09.6373932Z * [new branch] gh/wconstab/459/head -> origin/gh/wconstab/459/head 2025-12-04T11:11:09.6374006Z * [new branch] gh/wconstab/459/orig -> origin/gh/wconstab/459/orig 2025-12-04T11:11:09.6374079Z * [new branch] gh/wconstab/460/base -> origin/gh/wconstab/460/base 2025-12-04T11:11:09.6374150Z * [new branch] gh/wconstab/460/head -> origin/gh/wconstab/460/head 2025-12-04T11:11:09.6374222Z * [new branch] gh/wconstab/460/orig -> origin/gh/wconstab/460/orig 2025-12-04T11:11:09.6374296Z * [new branch] gh/wconstab/461/base -> origin/gh/wconstab/461/base 2025-12-04T11:11:09.6374367Z * [new branch] gh/wconstab/461/head -> origin/gh/wconstab/461/head 2025-12-04T11:11:09.6374458Z * [new branch] gh/wconstab/461/orig -> origin/gh/wconstab/461/orig 2025-12-04T11:11:09.6374533Z * [new branch] gh/wconstab/462/base -> origin/gh/wconstab/462/base 2025-12-04T11:11:09.6374605Z * [new branch] gh/wconstab/462/head -> origin/gh/wconstab/462/head 2025-12-04T11:11:09.6374700Z * [new branch] gh/wconstab/462/orig -> origin/gh/wconstab/462/orig 2025-12-04T11:11:09.6374774Z * [new branch] gh/wconstab/463/base -> origin/gh/wconstab/463/base 2025-12-04T11:11:09.6374846Z * [new branch] gh/wconstab/463/head -> origin/gh/wconstab/463/head 2025-12-04T11:11:09.6374921Z * [new branch] gh/wconstab/463/orig -> origin/gh/wconstab/463/orig 
2025-12-04T11:11:09.6374994Z * [new branch] gh/wconstab/464/base -> origin/gh/wconstab/464/base 2025-12-04T11:11:09.6375067Z * [new branch] gh/wconstab/464/head -> origin/gh/wconstab/464/head 2025-12-04T11:11:09.6375145Z * [new branch] gh/wconstab/464/orig -> origin/gh/wconstab/464/orig 2025-12-04T11:11:09.6375216Z * [new branch] gh/wconstab/465/base -> origin/gh/wconstab/465/base 2025-12-04T11:11:09.6375288Z * [new branch] gh/wconstab/465/head -> origin/gh/wconstab/465/head 2025-12-04T11:11:09.6375370Z * [new branch] gh/wconstab/465/orig -> origin/gh/wconstab/465/orig 2025-12-04T11:11:09.6375441Z * [new branch] gh/wconstab/466/base -> origin/gh/wconstab/466/base 2025-12-04T11:11:09.6375513Z * [new branch] gh/wconstab/466/head -> origin/gh/wconstab/466/head 2025-12-04T11:11:09.6375588Z * [new branch] gh/wconstab/466/orig -> origin/gh/wconstab/466/orig 2025-12-04T11:11:09.6375659Z * [new branch] gh/wconstab/467/base -> origin/gh/wconstab/467/base 2025-12-04T11:11:09.6375731Z * [new branch] gh/wconstab/467/head -> origin/gh/wconstab/467/head 2025-12-04T11:11:09.6375806Z * [new branch] gh/wconstab/467/orig -> origin/gh/wconstab/467/orig 2025-12-04T11:11:09.6375879Z * [new branch] gh/wconstab/468/base -> origin/gh/wconstab/468/base 2025-12-04T11:11:09.6375950Z * [new branch] gh/wconstab/468/head -> origin/gh/wconstab/468/head 2025-12-04T11:11:09.6376028Z * [new branch] gh/wconstab/468/orig -> origin/gh/wconstab/468/orig 2025-12-04T11:11:09.6376101Z * [new branch] gh/weifengpy/39/base -> origin/gh/weifengpy/39/base 2025-12-04T11:11:09.6376174Z * [new branch] gh/weifengpy/39/head -> origin/gh/weifengpy/39/head 2025-12-04T11:11:09.6376249Z * [new branch] gh/weifengpy/39/orig -> origin/gh/weifengpy/39/orig 2025-12-04T11:11:09.6376321Z * [new branch] gh/weifengpy/40/base -> origin/gh/weifengpy/40/base 2025-12-04T11:11:09.6376396Z * [new branch] gh/weifengpy/40/head -> origin/gh/weifengpy/40/head 2025-12-04T11:11:09.6376470Z * [new branch] gh/weifengpy/40/orig -> origin/gh/weifengpy/40/orig 2025-12-04T11:11:09.6376543Z * [new branch] gh/weifengpy/41/base -> origin/gh/weifengpy/41/base 2025-12-04T11:11:09.6376618Z * [new branch] gh/weifengpy/41/head -> origin/gh/weifengpy/41/head 2025-12-04T11:11:09.6376692Z * [new branch] gh/weifengpy/41/orig -> origin/gh/weifengpy/41/orig 2025-12-04T11:11:09.6376777Z * [new branch] gh/williamwen42/250/base -> origin/gh/williamwen42/250/base 2025-12-04T11:11:09.6376862Z * [new branch] gh/williamwen42/250/head -> origin/gh/williamwen42/250/head 2025-12-04T11:11:09.6376943Z * [new branch] gh/williamwen42/250/orig -> origin/gh/williamwen42/250/orig 2025-12-04T11:11:09.6377022Z * [new branch] gh/williamwen42/279/base -> origin/gh/williamwen42/279/base 2025-12-04T11:11:09.6377104Z * [new branch] gh/williamwen42/279/head -> origin/gh/williamwen42/279/head 2025-12-04T11:11:09.6377205Z * [new branch] gh/williamwen42/279/orig -> origin/gh/williamwen42/279/orig 2025-12-04T11:11:09.6377284Z * [new branch] gh/williamwen42/282/base -> origin/gh/williamwen42/282/base 2025-12-04T11:11:09.6377385Z * [new branch] gh/williamwen42/282/head -> origin/gh/williamwen42/282/head 2025-12-04T11:11:09.6377464Z * [new branch] gh/williamwen42/282/orig -> origin/gh/williamwen42/282/orig 2025-12-04T11:11:09.6377542Z * [new branch] gh/williamwen42/287/base -> origin/gh/williamwen42/287/base 2025-12-04T11:11:09.6377623Z * [new branch] gh/williamwen42/287/head -> origin/gh/williamwen42/287/head 2025-12-04T11:11:09.6377703Z * [new branch] gh/williamwen42/287/orig -> origin/gh/williamwen42/287/orig 
2025-12-04T11:11:09.6377782Z * [new branch] gh/williamwen42/288/base -> origin/gh/williamwen42/288/base 2025-12-04T11:11:09.6377863Z * [new branch] gh/williamwen42/288/head -> origin/gh/williamwen42/288/head 2025-12-04T11:11:09.6377942Z * [new branch] gh/williamwen42/288/orig -> origin/gh/williamwen42/288/orig 2025-12-04T11:11:09.6378023Z * [new branch] gh/williamwen42/296/base -> origin/gh/williamwen42/296/base 2025-12-04T11:11:09.6378104Z * [new branch] gh/williamwen42/296/head -> origin/gh/williamwen42/296/head 2025-12-04T11:11:09.6378215Z * [new branch] gh/williamwen42/296/orig -> origin/gh/williamwen42/296/orig 2025-12-04T11:11:09.6378296Z * [new branch] gh/williamwen42/297/base -> origin/gh/williamwen42/297/base 2025-12-04T11:11:09.6378377Z * [new branch] gh/williamwen42/297/head -> origin/gh/williamwen42/297/head 2025-12-04T11:11:09.6378456Z * [new branch] gh/williamwen42/297/orig -> origin/gh/williamwen42/297/orig 2025-12-04T11:11:09.6378538Z * [new branch] gh/williamwen42/306/base -> origin/gh/williamwen42/306/base 2025-12-04T11:11:09.6378618Z * [new branch] gh/williamwen42/306/head -> origin/gh/williamwen42/306/head 2025-12-04T11:11:09.6378697Z * [new branch] gh/williamwen42/306/orig -> origin/gh/williamwen42/306/orig 2025-12-04T11:11:09.6378779Z * [new branch] gh/williamwen42/309/base -> origin/gh/williamwen42/309/base 2025-12-04T11:11:09.6378857Z * [new branch] gh/williamwen42/309/head -> origin/gh/williamwen42/309/head 2025-12-04T11:11:09.6378936Z * [new branch] gh/williamwen42/309/orig -> origin/gh/williamwen42/309/orig 2025-12-04T11:11:09.6379017Z * [new branch] gh/williamwen42/310/base -> origin/gh/williamwen42/310/base 2025-12-04T11:11:09.6379095Z * [new branch] gh/williamwen42/310/head -> origin/gh/williamwen42/310/head 2025-12-04T11:11:09.6379174Z * [new branch] gh/williamwen42/310/orig -> origin/gh/williamwen42/310/orig 2025-12-04T11:11:09.6379256Z * [new branch] gh/williamwen42/311/base -> origin/gh/williamwen42/311/base 2025-12-04T11:11:09.6379335Z * [new branch] gh/williamwen42/311/head -> origin/gh/williamwen42/311/head 2025-12-04T11:11:09.6379415Z * [new branch] gh/williamwen42/311/orig -> origin/gh/williamwen42/311/orig 2025-12-04T11:11:09.6379496Z * [new branch] gh/williamwen42/319/base -> origin/gh/williamwen42/319/base 2025-12-04T11:11:09.6379574Z * [new branch] gh/williamwen42/319/head -> origin/gh/williamwen42/319/head 2025-12-04T11:11:09.6379654Z * [new branch] gh/williamwen42/319/orig -> origin/gh/williamwen42/319/orig 2025-12-04T11:11:09.6379733Z * [new branch] gh/williamwen42/325/base -> origin/gh/williamwen42/325/base 2025-12-04T11:11:09.6379813Z * [new branch] gh/williamwen42/325/head -> origin/gh/williamwen42/325/head 2025-12-04T11:11:09.6379896Z * [new branch] gh/williamwen42/325/orig -> origin/gh/williamwen42/325/orig 2025-12-04T11:11:09.6380008Z * [new branch] gh/williamwen42/326/base -> origin/gh/williamwen42/326/base 2025-12-04T11:11:09.6380087Z * [new branch] gh/williamwen42/326/head -> origin/gh/williamwen42/326/head 2025-12-04T11:11:09.6380203Z * [new branch] gh/williamwen42/326/orig -> origin/gh/williamwen42/326/orig 2025-12-04T11:11:09.6380281Z * [new branch] gh/williamwen42/327/base -> origin/gh/williamwen42/327/base 2025-12-04T11:11:09.6380363Z * [new branch] gh/williamwen42/327/head -> origin/gh/williamwen42/327/head 2025-12-04T11:11:09.6380444Z * [new branch] gh/williamwen42/327/orig -> origin/gh/williamwen42/327/orig 2025-12-04T11:11:09.6380523Z * [new branch] gh/williamwen42/328/base -> origin/gh/williamwen42/328/base 
2025-12-04T11:11:09.6380602Z * [new branch] gh/williamwen42/328/head -> origin/gh/williamwen42/328/head 2025-12-04T11:11:09.6380685Z * [new branch] gh/williamwen42/328/orig -> origin/gh/williamwen42/328/orig 2025-12-04T11:11:09.6380764Z * [new branch] gh/williamwen42/329/base -> origin/gh/williamwen42/329/base 2025-12-04T11:11:09.6380842Z * [new branch] gh/williamwen42/329/head -> origin/gh/williamwen42/329/head 2025-12-04T11:11:09.6380924Z * [new branch] gh/williamwen42/329/orig -> origin/gh/williamwen42/329/orig 2025-12-04T11:11:09.6381003Z * [new branch] gh/williamwen42/330/base -> origin/gh/williamwen42/330/base 2025-12-04T11:11:09.6381084Z * [new branch] gh/williamwen42/330/head -> origin/gh/williamwen42/330/head 2025-12-04T11:11:09.6381162Z * [new branch] gh/williamwen42/330/orig -> origin/gh/williamwen42/330/orig 2025-12-04T11:11:09.6381239Z * [new branch] gh/williamwen42/331/base -> origin/gh/williamwen42/331/base 2025-12-04T11:11:09.6381320Z * [new branch] gh/williamwen42/331/head -> origin/gh/williamwen42/331/head 2025-12-04T11:11:09.6381399Z * [new branch] gh/williamwen42/331/orig -> origin/gh/williamwen42/331/orig 2025-12-04T11:11:09.6381479Z * [new branch] gh/williamwen42/332/base -> origin/gh/williamwen42/332/base 2025-12-04T11:11:09.6381564Z * [new branch] gh/williamwen42/332/head -> origin/gh/williamwen42/332/head 2025-12-04T11:11:09.6381642Z * [new branch] gh/williamwen42/332/orig -> origin/gh/williamwen42/332/orig 2025-12-04T11:11:09.6381721Z * [new branch] gh/williamwen42/333/base -> origin/gh/williamwen42/333/base 2025-12-04T11:11:09.6381801Z * [new branch] gh/williamwen42/333/head -> origin/gh/williamwen42/333/head 2025-12-04T11:11:09.6381880Z * [new branch] gh/williamwen42/333/orig -> origin/gh/williamwen42/333/orig 2025-12-04T11:11:09.6381959Z * [new branch] gh/williamwen42/334/base -> origin/gh/williamwen42/334/base 2025-12-04T11:11:09.6382042Z * [new branch] gh/williamwen42/334/head -> origin/gh/williamwen42/334/head 2025-12-04T11:11:09.6382121Z * [new branch] gh/williamwen42/334/orig -> origin/gh/williamwen42/334/orig 2025-12-04T11:11:09.6382200Z * [new branch] gh/williamwen42/335/base -> origin/gh/williamwen42/335/base 2025-12-04T11:11:09.6382283Z * [new branch] gh/williamwen42/335/head -> origin/gh/williamwen42/335/head 2025-12-04T11:11:09.6382362Z * [new branch] gh/williamwen42/335/orig -> origin/gh/williamwen42/335/orig 2025-12-04T11:11:09.6382440Z * [new branch] gh/williamwen42/336/base -> origin/gh/williamwen42/336/base 2025-12-04T11:11:09.6382521Z * [new branch] gh/williamwen42/336/head -> origin/gh/williamwen42/336/head 2025-12-04T11:11:09.6382602Z * [new branch] gh/williamwen42/336/orig -> origin/gh/williamwen42/336/orig 2025-12-04T11:11:09.6382682Z * [new branch] gh/williamwen42/337/base -> origin/gh/williamwen42/337/base 2025-12-04T11:11:09.6382783Z * [new branch] gh/williamwen42/337/head -> origin/gh/williamwen42/337/head 2025-12-04T11:11:09.6382862Z * [new branch] gh/williamwen42/337/orig -> origin/gh/williamwen42/337/orig 2025-12-04T11:11:09.6382966Z * [new branch] gh/williamwen42/338/base -> origin/gh/williamwen42/338/base 2025-12-04T11:11:09.6383047Z * [new branch] gh/williamwen42/338/head -> origin/gh/williamwen42/338/head 2025-12-04T11:11:09.6383126Z * [new branch] gh/williamwen42/338/orig -> origin/gh/williamwen42/338/orig 2025-12-04T11:11:09.6383209Z * [new branch] gh/williamwen42/339/base -> origin/gh/williamwen42/339/base 2025-12-04T11:11:09.6383289Z * [new branch] gh/williamwen42/339/head -> origin/gh/williamwen42/339/head 
2025-12-04T11:11:09.6383369Z * [new branch] gh/williamwen42/339/orig -> origin/gh/williamwen42/339/orig 2025-12-04T11:11:09.6383451Z * [new branch] gh/williamwen42/340/base -> origin/gh/williamwen42/340/base 2025-12-04T11:11:09.6383532Z * [new branch] gh/williamwen42/340/head -> origin/gh/williamwen42/340/head 2025-12-04T11:11:09.6383613Z * [new branch] gh/williamwen42/340/orig -> origin/gh/williamwen42/340/orig 2025-12-04T11:11:09.6383696Z * [new branch] gh/williamwen42/341/base -> origin/gh/williamwen42/341/base 2025-12-04T11:11:09.6383775Z * [new branch] gh/williamwen42/341/head -> origin/gh/williamwen42/341/head 2025-12-04T11:11:09.6383853Z * [new branch] gh/williamwen42/341/orig -> origin/gh/williamwen42/341/orig 2025-12-04T11:11:09.6383934Z * [new branch] gh/williamwen42/342/base -> origin/gh/williamwen42/342/base 2025-12-04T11:11:09.6384015Z * [new branch] gh/williamwen42/342/head -> origin/gh/williamwen42/342/head 2025-12-04T11:11:09.6384101Z * [new branch] gh/williamwen42/342/orig -> origin/gh/williamwen42/342/orig 2025-12-04T11:11:09.6384182Z * [new branch] gh/williamwen42/343/base -> origin/gh/williamwen42/343/base 2025-12-04T11:11:09.6384264Z * [new branch] gh/williamwen42/343/head -> origin/gh/williamwen42/343/head 2025-12-04T11:11:09.6384349Z * [new branch] gh/williamwen42/343/orig -> origin/gh/williamwen42/343/orig 2025-12-04T11:11:09.6384430Z * [new branch] gh/williamwen42/344/base -> origin/gh/williamwen42/344/base 2025-12-04T11:11:09.6384514Z * [new branch] gh/williamwen42/344/head -> origin/gh/williamwen42/344/head 2025-12-04T11:11:09.6384599Z * [new branch] gh/williamwen42/344/orig -> origin/gh/williamwen42/344/orig 2025-12-04T11:11:09.6384680Z * [new branch] gh/williamwen42/345/base -> origin/gh/williamwen42/345/base 2025-12-04T11:11:09.6384759Z * [new branch] gh/williamwen42/345/head -> origin/gh/williamwen42/345/head 2025-12-04T11:11:09.6384841Z * [new branch] gh/williamwen42/345/orig -> origin/gh/williamwen42/345/orig 2025-12-04T11:11:09.6384924Z * [new branch] gh/williamwen42/346/base -> origin/gh/williamwen42/346/base 2025-12-04T11:11:09.6385005Z * [new branch] gh/williamwen42/346/head -> origin/gh/williamwen42/346/head 2025-12-04T11:11:09.6385091Z * [new branch] gh/williamwen42/346/orig -> origin/gh/williamwen42/346/orig 2025-12-04T11:11:09.6385173Z * [new branch] gh/williamwen42/347/base -> origin/gh/williamwen42/347/base 2025-12-04T11:11:09.6385254Z * [new branch] gh/williamwen42/347/head -> origin/gh/williamwen42/347/head 2025-12-04T11:11:09.6385336Z * [new branch] gh/williamwen42/347/orig -> origin/gh/williamwen42/347/orig 2025-12-04T11:11:09.6385415Z * [new branch] gh/williamwen42/348/base -> origin/gh/williamwen42/348/base 2025-12-04T11:11:09.6385494Z * [new branch] gh/williamwen42/348/head -> origin/gh/williamwen42/348/head 2025-12-04T11:11:09.6385601Z * [new branch] gh/williamwen42/348/orig -> origin/gh/williamwen42/348/orig 2025-12-04T11:11:09.6385684Z * [new branch] gh/williamwen42/349/base -> origin/gh/williamwen42/349/base 2025-12-04T11:11:09.6385768Z * [new branch] gh/williamwen42/349/head -> origin/gh/williamwen42/349/head 2025-12-04T11:11:09.6385875Z * [new branch] gh/williamwen42/349/orig -> origin/gh/williamwen42/349/orig 2025-12-04T11:11:09.6385954Z * [new branch] gh/williamwen42/350/base -> origin/gh/williamwen42/350/base 2025-12-04T11:11:09.6386040Z * [new branch] gh/williamwen42/350/head -> origin/gh/williamwen42/350/head 2025-12-04T11:11:09.6386125Z * [new branch] gh/williamwen42/350/orig -> origin/gh/williamwen42/350/orig 
2025-12-04T11:11:09.6386205Z * [new branch] gh/williamwen42/351/base -> origin/gh/williamwen42/351/base 2025-12-04T11:11:09.6386286Z * [new branch] gh/williamwen42/351/head -> origin/gh/williamwen42/351/head 2025-12-04T11:11:09.6386368Z * [new branch] gh/williamwen42/351/orig -> origin/gh/williamwen42/351/orig 2025-12-04T11:11:09.6386449Z * [new branch] gh/williamwen42/352/base -> origin/gh/williamwen42/352/base 2025-12-04T11:11:09.6386533Z * [new branch] gh/williamwen42/352/head -> origin/gh/williamwen42/352/head 2025-12-04T11:11:09.6386615Z * [new branch] gh/williamwen42/352/orig -> origin/gh/williamwen42/352/orig 2025-12-04T11:11:09.6386696Z * [new branch] gh/williamwen42/353/base -> origin/gh/williamwen42/353/base 2025-12-04T11:11:09.6386778Z * [new branch] gh/williamwen42/353/head -> origin/gh/williamwen42/353/head 2025-12-04T11:11:09.6386858Z * [new branch] gh/williamwen42/353/orig -> origin/gh/williamwen42/353/orig 2025-12-04T11:11:09.6386937Z * [new branch] gh/williamwen42/354/base -> origin/gh/williamwen42/354/base 2025-12-04T11:11:09.6387020Z * [new branch] gh/williamwen42/354/head -> origin/gh/williamwen42/354/head 2025-12-04T11:11:09.6387099Z * [new branch] gh/williamwen42/354/orig -> origin/gh/williamwen42/354/orig 2025-12-04T11:11:09.6387181Z * [new branch] gh/williamwen42/355/base -> origin/gh/williamwen42/355/base 2025-12-04T11:11:09.6387262Z * [new branch] gh/williamwen42/355/head -> origin/gh/williamwen42/355/head 2025-12-04T11:11:09.6387341Z * [new branch] gh/williamwen42/355/orig -> origin/gh/williamwen42/355/orig 2025-12-04T11:11:09.6387426Z * [new branch] gh/williamwen42/356/base -> origin/gh/williamwen42/356/base 2025-12-04T11:11:09.6387504Z * [new branch] gh/williamwen42/356/head -> origin/gh/williamwen42/356/head 2025-12-04T11:11:09.6387584Z * [new branch] gh/williamwen42/356/orig -> origin/gh/williamwen42/356/orig 2025-12-04T11:11:09.6387670Z * [new branch] gh/williamwen42/357/base -> origin/gh/williamwen42/357/base 2025-12-04T11:11:09.6387752Z * [new branch] gh/williamwen42/357/head -> origin/gh/williamwen42/357/head 2025-12-04T11:11:09.6387832Z * [new branch] gh/williamwen42/357/orig -> origin/gh/williamwen42/357/orig 2025-12-04T11:11:09.6387916Z * [new branch] gh/williamwen42/358/base -> origin/gh/williamwen42/358/base 2025-12-04T11:11:09.6387997Z * [new branch] gh/williamwen42/358/head -> origin/gh/williamwen42/358/head 2025-12-04T11:11:09.6388077Z * [new branch] gh/williamwen42/358/orig -> origin/gh/williamwen42/358/orig 2025-12-04T11:11:09.6388193Z * [new branch] gh/xmfan/169/base -> origin/gh/xmfan/169/base 2025-12-04T11:11:09.6388266Z * [new branch] gh/xmfan/169/head -> origin/gh/xmfan/169/head 2025-12-04T11:11:09.6388336Z * [new branch] gh/xmfan/170/base -> origin/gh/xmfan/170/base 2025-12-04T11:11:09.6388441Z * [new branch] gh/xmfan/170/head -> origin/gh/xmfan/170/head 2025-12-04T11:11:09.6388510Z * [new branch] gh/xmfan/274/base -> origin/gh/xmfan/274/base 2025-12-04T11:11:09.6388577Z * [new branch] gh/xmfan/274/head -> origin/gh/xmfan/274/head 2025-12-04T11:11:09.6388674Z * [new branch] gh/xmfan/274/orig -> origin/gh/xmfan/274/orig 2025-12-04T11:11:09.6388742Z * [new branch] gh/xmfan/277/base -> origin/gh/xmfan/277/base 2025-12-04T11:11:09.6388812Z * [new branch] gh/xmfan/277/head -> origin/gh/xmfan/277/head 2025-12-04T11:11:09.6388879Z * [new branch] gh/xmfan/277/orig -> origin/gh/xmfan/277/orig 2025-12-04T11:11:09.6388947Z * [new branch] gh/xmfan/301/base -> origin/gh/xmfan/301/base 2025-12-04T11:11:09.6389020Z * [new branch] gh/xmfan/301/head -> 
origin/gh/xmfan/301/head 2025-12-04T11:11:09.6389088Z * [new branch] gh/xmfan/301/orig -> origin/gh/xmfan/301/orig 2025-12-04T11:11:09.6389161Z * [new branch] gh/xmfan/304/base -> origin/gh/xmfan/304/base 2025-12-04T11:11:09.6389237Z * [new branch] gh/xmfan/304/head -> origin/gh/xmfan/304/head 2025-12-04T11:11:09.6389310Z * [new branch] gh/xmfan/304/orig -> origin/gh/xmfan/304/orig 2025-12-04T11:11:09.6389377Z * [new branch] gh/xmfan/309/base -> origin/gh/xmfan/309/base 2025-12-04T11:11:09.6389447Z * [new branch] gh/xmfan/309/head -> origin/gh/xmfan/309/head 2025-12-04T11:11:09.6389514Z * [new branch] gh/xmfan/309/orig -> origin/gh/xmfan/309/orig 2025-12-04T11:11:09.6389581Z * [new branch] gh/xmfan/310/base -> origin/gh/xmfan/310/base 2025-12-04T11:11:09.6389651Z * [new branch] gh/xmfan/310/head -> origin/gh/xmfan/310/head 2025-12-04T11:11:09.6389719Z * [new branch] gh/xmfan/310/orig -> origin/gh/xmfan/310/orig 2025-12-04T11:11:09.6389789Z * [new branch] gh/xmfan/311/base -> origin/gh/xmfan/311/base 2025-12-04T11:11:09.6389861Z * [new branch] gh/xmfan/311/head -> origin/gh/xmfan/311/head 2025-12-04T11:11:09.6389931Z * [new branch] gh/xmfan/311/orig -> origin/gh/xmfan/311/orig 2025-12-04T11:11:09.6389998Z * [new branch] gh/xmfan/312/base -> origin/gh/xmfan/312/base 2025-12-04T11:11:09.6390069Z * [new branch] gh/xmfan/312/head -> origin/gh/xmfan/312/head 2025-12-04T11:11:09.6390136Z * [new branch] gh/xmfan/312/orig -> origin/gh/xmfan/312/orig 2025-12-04T11:11:09.6390205Z * [new branch] gh/xmfan/313/base -> origin/gh/xmfan/313/base 2025-12-04T11:11:09.6390275Z * [new branch] gh/xmfan/313/head -> origin/gh/xmfan/313/head 2025-12-04T11:11:09.6390343Z * [new branch] gh/xmfan/313/orig -> origin/gh/xmfan/313/orig 2025-12-04T11:11:09.6390424Z * [new branch] gh/xuanzhang816/27/base -> origin/gh/xuanzhang816/27/base 2025-12-04T11:11:09.6390508Z * [new branch] gh/xuanzhang816/27/head -> origin/gh/xuanzhang816/27/head 2025-12-04T11:11:09.6390591Z * [new branch] gh/xuanzhang816/27/orig -> origin/gh/xuanzhang816/27/orig 2025-12-04T11:11:09.6390673Z * [new branch] gh/xuanzhang816/32/base -> origin/gh/xuanzhang816/32/base 2025-12-04T11:11:09.6390751Z * [new branch] gh/xuanzhang816/32/head -> origin/gh/xuanzhang816/32/head 2025-12-04T11:11:09.6390829Z * [new branch] gh/xuanzhang816/32/orig -> origin/gh/xuanzhang816/32/orig 2025-12-04T11:11:09.6390909Z * [new branch] gh/xuanzhang816/33/base -> origin/gh/xuanzhang816/33/base 2025-12-04T11:11:09.6390988Z * [new branch] gh/xuanzhang816/33/head -> origin/gh/xuanzhang816/33/head 2025-12-04T11:11:09.6391065Z * [new branch] gh/xuanzhang816/33/orig -> origin/gh/xuanzhang816/33/orig 2025-12-04T11:11:09.6391174Z * [new branch] gh/xuanzhang816/34/base -> origin/gh/xuanzhang816/34/base 2025-12-04T11:11:09.6391250Z * [new branch] gh/xuanzhang816/34/head -> origin/gh/xuanzhang816/34/head 2025-12-04T11:11:09.6391349Z * [new branch] gh/xuanzhang816/34/orig -> origin/gh/xuanzhang816/34/orig 2025-12-04T11:11:09.6391428Z * [new branch] gh/xuanzhang816/35/base -> origin/gh/xuanzhang816/35/base 2025-12-04T11:11:09.6391507Z * [new branch] gh/xuanzhang816/35/head -> origin/gh/xuanzhang816/35/head 2025-12-04T11:11:09.6391584Z * [new branch] gh/xuanzhang816/35/orig -> origin/gh/xuanzhang816/35/orig 2025-12-04T11:11:09.6391662Z * [new branch] gh/yanbing-j/11/base -> origin/gh/yanbing-j/11/base 2025-12-04T11:11:09.6391736Z * [new branch] gh/yanbing-j/11/head -> origin/gh/yanbing-j/11/head 2025-12-04T11:11:09.6391812Z * [new branch] gh/yanbing-j/11/orig -> origin/gh/yanbing-j/11/orig 
2025-12-04T11:11:09.6391887Z * [new branch] gh/yanbing-j/12/base -> origin/gh/yanbing-j/12/base 2025-12-04T11:11:09.6391959Z * [new branch] gh/yanbing-j/12/head -> origin/gh/yanbing-j/12/head 2025-12-04T11:11:09.6392033Z * [new branch] gh/yanbing-j/12/orig -> origin/gh/yanbing-j/12/orig 2025-12-04T11:11:09.6392110Z * [new branch] gh/yanbing-j/13/base -> origin/gh/yanbing-j/13/base 2025-12-04T11:11:09.6392185Z * [new branch] gh/yanbing-j/13/head -> origin/gh/yanbing-j/13/head 2025-12-04T11:11:09.6392260Z * [new branch] gh/yanbing-j/13/orig -> origin/gh/yanbing-j/13/orig 2025-12-04T11:11:09.6392334Z * [new branch] gh/yanbing-j/14/base -> origin/gh/yanbing-j/14/base 2025-12-04T11:11:09.6392408Z * [new branch] gh/yanbing-j/14/head -> origin/gh/yanbing-j/14/head 2025-12-04T11:11:09.6392485Z * [new branch] gh/yanbing-j/14/orig -> origin/gh/yanbing-j/14/orig 2025-12-04T11:11:09.6392556Z * [new branch] gh/yanbing-j/15/base -> origin/gh/yanbing-j/15/base 2025-12-04T11:11:09.6392627Z * [new branch] gh/yanbing-j/15/head -> origin/gh/yanbing-j/15/head 2025-12-04T11:11:09.6392704Z * [new branch] gh/yanbing-j/15/orig -> origin/gh/yanbing-j/15/orig 2025-12-04T11:11:09.6392776Z * [new branch] gh/yanbing-j/18/base -> origin/gh/yanbing-j/18/base 2025-12-04T11:11:09.6392849Z * [new branch] gh/yanbing-j/18/head -> origin/gh/yanbing-j/18/head 2025-12-04T11:11:09.6392923Z * [new branch] gh/yanbing-j/18/orig -> origin/gh/yanbing-j/18/orig 2025-12-04T11:11:09.6392994Z * [new branch] gh/yanbing-j/19/base -> origin/gh/yanbing-j/19/base 2025-12-04T11:11:09.6393065Z * [new branch] gh/yanbing-j/19/head -> origin/gh/yanbing-j/19/head 2025-12-04T11:11:09.6393143Z * [new branch] gh/yanbing-j/19/orig -> origin/gh/yanbing-j/19/orig 2025-12-04T11:11:09.6393217Z * [new branch] gh/yanbing-j/20/base -> origin/gh/yanbing-j/20/base 2025-12-04T11:11:09.6393290Z * [new branch] gh/yanbing-j/20/head -> origin/gh/yanbing-j/20/head 2025-12-04T11:11:09.6393365Z * [new branch] gh/yanbing-j/20/orig -> origin/gh/yanbing-j/20/orig 2025-12-04T11:11:09.6393436Z * [new branch] gh/yanbing-j/21/base -> origin/gh/yanbing-j/21/base 2025-12-04T11:11:09.6393507Z * [new branch] gh/yanbing-j/21/head -> origin/gh/yanbing-j/21/head 2025-12-04T11:11:09.6393580Z * [new branch] gh/yanbing-j/22/base -> origin/gh/yanbing-j/22/base 2025-12-04T11:11:09.6393651Z * [new branch] gh/yanbing-j/22/head -> origin/gh/yanbing-j/22/head 2025-12-04T11:11:09.6393724Z * [new branch] gh/yanbing-j/22/orig -> origin/gh/yanbing-j/22/orig 2025-12-04T11:11:09.6393816Z * [new branch] gh/yanbing-j/23/base -> origin/gh/yanbing-j/23/base 2025-12-04T11:11:09.6393887Z * [new branch] gh/yanbing-j/23/head -> origin/gh/yanbing-j/23/head 2025-12-04T11:11:09.6393962Z * [new branch] gh/yanbing-j/23/orig -> origin/gh/yanbing-j/23/orig 2025-12-04T11:11:09.6394082Z * [new branch] gh/yanbing-j/24/base -> origin/gh/yanbing-j/24/base 2025-12-04T11:11:09.6394152Z * [new branch] gh/yanbing-j/24/head -> origin/gh/yanbing-j/24/head 2025-12-04T11:11:09.6394226Z * [new branch] gh/yanbing-j/24/orig -> origin/gh/yanbing-j/24/orig 2025-12-04T11:11:09.6394297Z * [new branch] gh/yanbing-j/25/base -> origin/gh/yanbing-j/25/base 2025-12-04T11:11:09.6394369Z * [new branch] gh/yanbing-j/25/head -> origin/gh/yanbing-j/25/head 2025-12-04T11:11:09.6394444Z * [new branch] gh/yanbing-j/25/orig -> origin/gh/yanbing-j/25/orig 2025-12-04T11:11:09.6394518Z * [new branch] gh/yanbing-j/26/base -> origin/gh/yanbing-j/26/base 2025-12-04T11:11:09.6394589Z * [new branch] gh/yanbing-j/26/head -> origin/gh/yanbing-j/26/head 
2025-12-04T11:11:09.6394665Z * [new branch] gh/yanbing-j/26/orig -> origin/gh/yanbing-j/26/orig 2025-12-04T11:11:09.6394747Z * [new branch] gh/yang-yu-hang/1/base -> origin/gh/yang-yu-hang/1/base 2025-12-04T11:11:09.6394825Z * [new branch] gh/yang-yu-hang/1/head -> origin/gh/yang-yu-hang/1/head 2025-12-04T11:11:09.6394906Z * [new branch] gh/yang-yu-hang/1/orig -> origin/gh/yang-yu-hang/1/orig 2025-12-04T11:11:09.6394982Z * [new branch] gh/yang-yu-hang/2/base -> origin/gh/yang-yu-hang/2/base 2025-12-04T11:11:09.6395058Z * [new branch] gh/yang-yu-hang/2/head -> origin/gh/yang-yu-hang/2/head 2025-12-04T11:11:09.6395136Z * [new branch] gh/yang-yu-hang/2/orig -> origin/gh/yang-yu-hang/2/orig 2025-12-04T11:11:09.6395211Z * [new branch] gh/yang-yu-hang/3/base -> origin/gh/yang-yu-hang/3/base 2025-12-04T11:11:09.6395290Z * [new branch] gh/yang-yu-hang/3/head -> origin/gh/yang-yu-hang/3/head 2025-12-04T11:11:09.6395367Z * [new branch] gh/yang-yu-hang/3/orig -> origin/gh/yang-yu-hang/3/orig 2025-12-04T11:11:09.6395443Z * [new branch] gh/yangw-dev/12/base -> origin/gh/yangw-dev/12/base 2025-12-04T11:11:09.6395522Z * [new branch] gh/yangw-dev/12/head -> origin/gh/yangw-dev/12/head 2025-12-04T11:11:09.6395597Z * [new branch] gh/yangw-dev/12/orig -> origin/gh/yangw-dev/12/orig 2025-12-04T11:11:09.6395672Z * [new branch] gh/yangw-dev/13/base -> origin/gh/yangw-dev/13/base 2025-12-04T11:11:09.6395748Z * [new branch] gh/yangw-dev/13/head -> origin/gh/yangw-dev/13/head 2025-12-04T11:11:09.6395821Z * [new branch] gh/yangw-dev/13/orig -> origin/gh/yangw-dev/13/orig 2025-12-04T11:11:09.6395896Z * [new branch] gh/yangw-dev/14/base -> origin/gh/yangw-dev/14/base 2025-12-04T11:11:09.6395969Z * [new branch] gh/yangw-dev/14/head -> origin/gh/yangw-dev/14/head 2025-12-04T11:11:09.6396042Z * [new branch] gh/yangw-dev/14/orig -> origin/gh/yangw-dev/14/orig 2025-12-04T11:11:09.6396114Z * [new branch] gh/yangw-dev/15/base -> origin/gh/yangw-dev/15/base 2025-12-04T11:11:09.6396188Z * [new branch] gh/yangw-dev/15/head -> origin/gh/yangw-dev/15/head 2025-12-04T11:11:09.6396260Z * [new branch] gh/yangw-dev/15/orig -> origin/gh/yangw-dev/15/orig 2025-12-04T11:11:09.6396332Z * [new branch] gh/yangw-dev/19/base -> origin/gh/yangw-dev/19/base 2025-12-04T11:11:09.6396406Z * [new branch] gh/yangw-dev/19/head -> origin/gh/yangw-dev/19/head 2025-12-04T11:11:09.6396477Z * [new branch] gh/yangw-dev/19/orig -> origin/gh/yangw-dev/19/orig 2025-12-04T11:11:09.6396578Z * [new branch] gh/yangw-dev/26/base -> origin/gh/yangw-dev/26/base 2025-12-04T11:11:09.6396654Z * [new branch] gh/yangw-dev/26/head -> origin/gh/yangw-dev/26/head 2025-12-04T11:11:09.6396745Z * [new branch] gh/yangw-dev/26/orig -> origin/gh/yangw-dev/26/orig 2025-12-04T11:11:09.6396818Z * [new branch] gh/yangw-dev/27/base -> origin/gh/yangw-dev/27/base 2025-12-04T11:11:09.6396895Z * [new branch] gh/yangw-dev/27/head -> origin/gh/yangw-dev/27/head 2025-12-04T11:11:09.6396968Z * [new branch] gh/yangw-dev/27/orig -> origin/gh/yangw-dev/27/orig 2025-12-04T11:11:09.6397043Z * [new branch] gh/ydwu4/292/base -> origin/gh/ydwu4/292/base 2025-12-04T11:11:09.6397113Z * [new branch] gh/ydwu4/292/head -> origin/gh/ydwu4/292/head 2025-12-04T11:11:09.6397184Z * [new branch] gh/ydwu4/292/orig -> origin/gh/ydwu4/292/orig 2025-12-04T11:11:09.6397253Z * [new branch] gh/ydwu4/294/base -> origin/gh/ydwu4/294/base 2025-12-04T11:11:09.6397320Z * [new branch] gh/ydwu4/294/head -> origin/gh/ydwu4/294/head 2025-12-04T11:11:09.6397389Z * [new branch] gh/ydwu4/294/orig -> origin/gh/ydwu4/294/orig 
2025-12-04T11:11:09.6397459Z * [new branch] gh/ydwu4/295/base -> origin/gh/ydwu4/295/base 2025-12-04T11:11:09.6397526Z * [new branch] gh/ydwu4/295/head -> origin/gh/ydwu4/295/head 2025-12-04T11:11:09.6397594Z * [new branch] gh/ydwu4/295/orig -> origin/gh/ydwu4/295/orig 2025-12-04T11:11:09.6397667Z * [new branch] gh/ydwu4/296/base -> origin/gh/ydwu4/296/base 2025-12-04T11:11:09.6397734Z * [new branch] gh/ydwu4/296/head -> origin/gh/ydwu4/296/head 2025-12-04T11:11:09.6397801Z * [new branch] gh/ydwu4/296/orig -> origin/gh/ydwu4/296/orig 2025-12-04T11:11:09.6397871Z * [new branch] gh/ydwu4/306/base -> origin/gh/ydwu4/306/base 2025-12-04T11:11:09.6397939Z * [new branch] gh/ydwu4/306/head -> origin/gh/ydwu4/306/head 2025-12-04T11:11:09.6398007Z * [new branch] gh/ydwu4/306/orig -> origin/gh/ydwu4/306/orig 2025-12-04T11:11:09.6398077Z * [new branch] gh/ydwu4/312/base -> origin/gh/ydwu4/312/base 2025-12-04T11:11:09.6398177Z * [new branch] gh/ydwu4/312/head -> origin/gh/ydwu4/312/head 2025-12-04T11:11:09.6398246Z * [new branch] gh/ydwu4/312/orig -> origin/gh/ydwu4/312/orig 2025-12-04T11:11:09.6398319Z * [new branch] gh/ydwu4/322/base -> origin/gh/ydwu4/322/base 2025-12-04T11:11:09.6398390Z * [new branch] gh/ydwu4/322/head -> origin/gh/ydwu4/322/head 2025-12-04T11:11:09.6398459Z * [new branch] gh/ydwu4/322/orig -> origin/gh/ydwu4/322/orig 2025-12-04T11:11:09.6398533Z * [new branch] gh/ydwu4/327/base -> origin/gh/ydwu4/327/base 2025-12-04T11:11:09.6398603Z * [new branch] gh/ydwu4/327/head -> origin/gh/ydwu4/327/head 2025-12-04T11:11:09.6398677Z * [new branch] gh/ydwu4/327/orig -> origin/gh/ydwu4/327/orig 2025-12-04T11:11:09.6398747Z * [new branch] gh/ydwu4/328/base -> origin/gh/ydwu4/328/base 2025-12-04T11:11:09.6398816Z * [new branch] gh/ydwu4/328/head -> origin/gh/ydwu4/328/head 2025-12-04T11:11:09.6398890Z * [new branch] gh/ydwu4/328/orig -> origin/gh/ydwu4/328/orig 2025-12-04T11:11:09.6398960Z * [new branch] gh/ydwu4/329/base -> origin/gh/ydwu4/329/base 2025-12-04T11:11:09.6399028Z * [new branch] gh/ydwu4/329/head -> origin/gh/ydwu4/329/head 2025-12-04T11:11:09.6399101Z * [new branch] gh/ydwu4/329/orig -> origin/gh/ydwu4/329/orig 2025-12-04T11:11:09.6399203Z * [new branch] gh/ydwu4/330/base -> origin/gh/ydwu4/330/base 2025-12-04T11:11:09.6399273Z * [new branch] gh/ydwu4/330/head -> origin/gh/ydwu4/330/head 2025-12-04T11:11:09.6399374Z * [new branch] gh/ydwu4/330/orig -> origin/gh/ydwu4/330/orig 2025-12-04T11:11:09.6399443Z * [new branch] gh/ydwu4/331/base -> origin/gh/ydwu4/331/base 2025-12-04T11:11:09.6399512Z * [new branch] gh/ydwu4/331/head -> origin/gh/ydwu4/331/head 2025-12-04T11:11:09.6399586Z * [new branch] gh/ydwu4/331/orig -> origin/gh/ydwu4/331/orig 2025-12-04T11:11:09.6409244Z * [new branch] gh/ydwu4/332/base -> origin/gh/ydwu4/332/base 2025-12-04T11:11:09.6409325Z * [new branch] gh/ydwu4/332/head -> origin/gh/ydwu4/332/head 2025-12-04T11:11:09.6409397Z * [new branch] gh/ydwu4/332/orig -> origin/gh/ydwu4/332/orig 2025-12-04T11:11:09.6409474Z * [new branch] gh/ydwu4/333/base -> origin/gh/ydwu4/333/base 2025-12-04T11:11:09.6409541Z * [new branch] gh/ydwu4/333/head -> origin/gh/ydwu4/333/head 2025-12-04T11:11:09.6409616Z * [new branch] gh/ydwu4/333/orig -> origin/gh/ydwu4/333/orig 2025-12-04T11:11:09.6409694Z * [new branch] gh/ydwu4/334/base -> origin/gh/ydwu4/334/base 2025-12-04T11:11:09.6409765Z * [new branch] gh/ydwu4/334/head -> origin/gh/ydwu4/334/head 2025-12-04T11:11:09.6409833Z * [new branch] gh/ydwu4/334/orig -> origin/gh/ydwu4/334/orig 2025-12-04T11:11:09.6409900Z * [new branch] 
gh/ydwu4/335/base -> origin/gh/ydwu4/335/base 2025-12-04T11:11:09.6409972Z * [new branch] gh/ydwu4/335/head -> origin/gh/ydwu4/335/head 2025-12-04T11:11:09.6410040Z * [new branch] gh/ydwu4/335/orig -> origin/gh/ydwu4/335/orig 2025-12-04T11:11:09.6410109Z * [new branch] gh/ydwu4/337/base -> origin/gh/ydwu4/337/base 2025-12-04T11:11:09.6410185Z * [new branch] gh/ydwu4/337/head -> origin/gh/ydwu4/337/head 2025-12-04T11:11:09.6410254Z * [new branch] gh/ydwu4/337/orig -> origin/gh/ydwu4/337/orig 2025-12-04T11:11:09.6410324Z * [new branch] gh/ydwu4/339/base -> origin/gh/ydwu4/339/base 2025-12-04T11:11:09.6410399Z * [new branch] gh/ydwu4/339/head -> origin/gh/ydwu4/339/head 2025-12-04T11:11:09.6410465Z * [new branch] gh/ydwu4/339/orig -> origin/gh/ydwu4/339/orig 2025-12-04T11:11:09.6410536Z * [new branch] gh/yf225/133/base -> origin/gh/yf225/133/base 2025-12-04T11:11:09.6410608Z * [new branch] gh/yf225/133/head -> origin/gh/yf225/133/head 2025-12-04T11:11:09.6410677Z * [new branch] gh/yf225/93/base -> origin/gh/yf225/93/base 2025-12-04T11:11:09.6410748Z * [new branch] gh/yf225/93/head -> origin/gh/yf225/93/head 2025-12-04T11:11:09.6410829Z * [new branch] gh/yifuwang/152/base -> origin/gh/yifuwang/152/base 2025-12-04T11:11:09.6410905Z * [new branch] gh/yifuwang/152/head -> origin/gh/yifuwang/152/head 2025-12-04T11:11:09.6410982Z * [new branch] gh/yifuwang/152/orig -> origin/gh/yifuwang/152/orig 2025-12-04T11:11:09.6411061Z * [new branch] gh/yifuwang/195/base -> origin/gh/yifuwang/195/base 2025-12-04T11:11:09.6411138Z * [new branch] gh/yifuwang/195/head -> origin/gh/yifuwang/195/head 2025-12-04T11:11:09.6411211Z * [new branch] gh/yifuwang/195/orig -> origin/gh/yifuwang/195/orig 2025-12-04T11:11:09.6411294Z * [new branch] gh/yiming0416/1/base -> origin/gh/yiming0416/1/base 2025-12-04T11:11:09.6411368Z * [new branch] gh/yiming0416/1/head -> origin/gh/yiming0416/1/head 2025-12-04T11:11:09.6411492Z * [new branch] gh/yiming0416/2/base -> origin/gh/yiming0416/2/base 2025-12-04T11:11:09.6411566Z * [new branch] gh/yiming0416/2/head -> origin/gh/yiming0416/2/head 2025-12-04T11:11:09.6411643Z * [new branch] gh/yushangdi/1/base -> origin/gh/yushangdi/1/base 2025-12-04T11:11:09.6411753Z * [new branch] gh/yushangdi/1/head -> origin/gh/yushangdi/1/head 2025-12-04T11:11:09.6411830Z * [new branch] gh/yushangdi/10/base -> origin/gh/yushangdi/10/base 2025-12-04T11:11:09.6411906Z * [new branch] gh/yushangdi/10/head -> origin/gh/yushangdi/10/head 2025-12-04T11:11:09.6411985Z * [new branch] gh/yushangdi/10/orig -> origin/gh/yushangdi/10/orig 2025-12-04T11:11:09.6412060Z * [new branch] gh/yushangdi/11/base -> origin/gh/yushangdi/11/base 2025-12-04T11:11:09.6412134Z * [new branch] gh/yushangdi/11/head -> origin/gh/yushangdi/11/head 2025-12-04T11:11:09.6412216Z * [new branch] gh/yushangdi/11/orig -> origin/gh/yushangdi/11/orig 2025-12-04T11:11:09.6412291Z * [new branch] gh/yushangdi/2/base -> origin/gh/yushangdi/2/base 2025-12-04T11:11:09.6412366Z * [new branch] gh/yushangdi/2/head -> origin/gh/yushangdi/2/head 2025-12-04T11:11:09.6412446Z * [new branch] gh/yushangdi/7/base -> origin/gh/yushangdi/7/base 2025-12-04T11:11:09.6412520Z * [new branch] gh/yushangdi/7/head -> origin/gh/yushangdi/7/head 2025-12-04T11:11:09.6412593Z * [new branch] gh/yushangdi/7/orig -> origin/gh/yushangdi/7/orig 2025-12-04T11:11:09.6412674Z * [new branch] gh/yushangdi/8/base -> origin/gh/yushangdi/8/base 2025-12-04T11:11:09.6412748Z * [new branch] gh/yushangdi/8/head -> origin/gh/yushangdi/8/head 2025-12-04T11:11:09.6412822Z * [new branch] 
gh/yushangdi/8/orig -> origin/gh/yushangdi/8/orig 2025-12-04T11:11:09.6412902Z * [new branch] gh/yushangdi/9/base -> origin/gh/yushangdi/9/base 2025-12-04T11:11:09.6412977Z * [new branch] gh/yushangdi/9/head -> origin/gh/yushangdi/9/head 2025-12-04T11:11:09.6413050Z * [new branch] gh/yushangdi/9/orig -> origin/gh/yushangdi/9/orig 2025-12-04T11:11:09.6413130Z * [new branch] gh/zklaus/19/base -> origin/gh/zklaus/19/base 2025-12-04T11:11:09.6413203Z * [new branch] gh/zklaus/19/head -> origin/gh/zklaus/19/head 2025-12-04T11:11:09.6413279Z * [new branch] gh/zklaus/19/orig -> origin/gh/zklaus/19/orig 2025-12-04T11:11:09.6413348Z * [new branch] gh/zklaus/20/base -> origin/gh/zklaus/20/base 2025-12-04T11:11:09.6413417Z * [new branch] gh/zklaus/20/head -> origin/gh/zklaus/20/head 2025-12-04T11:11:09.6413491Z * [new branch] gh/zklaus/20/orig -> origin/gh/zklaus/20/orig 2025-12-04T11:11:09.6413561Z * [new branch] gh/zklaus/21/base -> origin/gh/zklaus/21/base 2025-12-04T11:11:09.6413631Z * [new branch] gh/zklaus/21/head -> origin/gh/zklaus/21/head 2025-12-04T11:11:09.6413706Z * [new branch] gh/zklaus/21/orig -> origin/gh/zklaus/21/orig 2025-12-04T11:11:09.6413778Z * [new branch] gh/zklaus/22/base -> origin/gh/zklaus/22/base 2025-12-04T11:11:09.6413848Z * [new branch] gh/zklaus/22/head -> origin/gh/zklaus/22/head 2025-12-04T11:11:09.6413924Z * [new branch] gh/zklaus/22/orig -> origin/gh/zklaus/22/orig 2025-12-04T11:11:09.6413992Z * [new branch] gh/zklaus/23/base -> origin/gh/zklaus/23/base 2025-12-04T11:11:09.6414062Z * [new branch] gh/zklaus/23/head -> origin/gh/zklaus/23/head 2025-12-04T11:11:09.6414134Z * [new branch] gh/zklaus/23/orig -> origin/gh/zklaus/23/orig 2025-12-04T11:11:09.6414224Z * [new branch] gh/zklaus/24/base -> origin/gh/zklaus/24/base 2025-12-04T11:11:09.6414295Z * [new branch] gh/zklaus/24/head -> origin/gh/zklaus/24/head 2025-12-04T11:11:09.6414371Z * [new branch] gh/zklaus/24/orig -> origin/gh/zklaus/24/orig 2025-12-04T11:11:09.6414470Z * [new branch] gh/zou3519/1197/base -> origin/gh/zou3519/1197/base 2025-12-04T11:11:09.6414544Z * [new branch] gh/zou3519/1197/head -> origin/gh/zou3519/1197/head 2025-12-04T11:11:09.6414622Z * [new branch] gh/zou3519/1197/orig -> origin/gh/zou3519/1197/orig 2025-12-04T11:11:09.6414693Z * [new branch] gh/zou3519/1199/base -> origin/gh/zou3519/1199/base 2025-12-04T11:11:09.6414767Z * [new branch] gh/zou3519/1199/head -> origin/gh/zou3519/1199/head 2025-12-04T11:11:09.6414846Z * [new branch] gh/zou3519/1199/orig -> origin/gh/zou3519/1199/orig 2025-12-04T11:11:09.6414922Z * [new branch] gh/zou3519/1200/base -> origin/gh/zou3519/1200/base 2025-12-04T11:11:09.6415002Z * [new branch] gh/zou3519/1200/head -> origin/gh/zou3519/1200/head 2025-12-04T11:11:09.6415077Z * [new branch] gh/zou3519/1200/orig -> origin/gh/zou3519/1200/orig 2025-12-04T11:11:09.6415152Z * [new branch] gh/zou3519/1201/base -> origin/gh/zou3519/1201/base 2025-12-04T11:11:09.6415226Z * [new branch] gh/zou3519/1201/head -> origin/gh/zou3519/1201/head 2025-12-04T11:11:09.6415299Z * [new branch] gh/zou3519/1201/orig -> origin/gh/zou3519/1201/orig 2025-12-04T11:11:09.6415372Z * [new branch] gh/zou3519/1202/base -> origin/gh/zou3519/1202/base 2025-12-04T11:11:09.6415449Z * [new branch] gh/zou3519/1202/head -> origin/gh/zou3519/1202/head 2025-12-04T11:11:09.6415522Z * [new branch] gh/zou3519/1202/orig -> origin/gh/zou3519/1202/orig 2025-12-04T11:11:09.6415599Z * [new branch] gh/zpcore/1/base -> origin/gh/zpcore/1/base 2025-12-04T11:11:09.6415676Z * [new branch] gh/zpcore/1/head -> 
origin/gh/zpcore/1/head 2025-12-04T11:11:09.6415747Z * [new branch] gh/zpcore/11/base -> origin/gh/zpcore/11/base 2025-12-04T11:11:09.6415818Z * [new branch] gh/zpcore/11/head -> origin/gh/zpcore/11/head 2025-12-04T11:11:09.6415893Z * [new branch] gh/zpcore/11/orig -> origin/gh/zpcore/11/orig 2025-12-04T11:11:09.6415965Z * [new branch] gh/zpcore/12/base -> origin/gh/zpcore/12/base 2025-12-04T11:11:09.6416034Z * [new branch] gh/zpcore/12/head -> origin/gh/zpcore/12/head 2025-12-04T11:11:09.6416106Z * [new branch] gh/zpcore/12/orig -> origin/gh/zpcore/12/orig 2025-12-04T11:11:09.6416176Z * [new branch] gh/zpcore/13/base -> origin/gh/zpcore/13/base 2025-12-04T11:11:09.6416246Z * [new branch] gh/zpcore/13/head -> origin/gh/zpcore/13/head 2025-12-04T11:11:09.6416320Z * [new branch] gh/zpcore/13/orig -> origin/gh/zpcore/13/orig 2025-12-04T11:11:09.6416392Z * [new branch] gh/zpcore/14/base -> origin/gh/zpcore/14/base 2025-12-04T11:11:09.6416463Z * [new branch] gh/zpcore/14/head -> origin/gh/zpcore/14/head 2025-12-04T11:11:09.6416541Z * [new branch] gh/zpcore/14/orig -> origin/gh/zpcore/14/orig 2025-12-04T11:11:09.6416612Z * [new branch] gh/zpcore/15/base -> origin/gh/zpcore/15/base 2025-12-04T11:11:09.6416687Z * [new branch] gh/zpcore/15/head -> origin/gh/zpcore/15/head 2025-12-04T11:11:09.6416756Z * [new branch] gh/zpcore/15/orig -> origin/gh/zpcore/15/orig 2025-12-04T11:11:09.6416828Z * [new branch] gh/zpcore/2/base -> origin/gh/zpcore/2/base 2025-12-04T11:11:09.6416900Z * [new branch] gh/zpcore/2/head -> origin/gh/zpcore/2/head 2025-12-04T11:11:09.6416992Z * [new branch] gh/zpcore/21/base -> origin/gh/zpcore/21/base 2025-12-04T11:11:09.6417064Z * [new branch] gh/zpcore/21/head -> origin/gh/zpcore/21/head 2025-12-04T11:11:09.6417160Z * [new branch] gh/zpcore/21/orig -> origin/gh/zpcore/21/orig 2025-12-04T11:11:09.6417230Z * [new branch] gh/zpcore/22/base -> origin/gh/zpcore/22/base 2025-12-04T11:11:09.6417302Z * [new branch] gh/zpcore/22/head -> origin/gh/zpcore/22/head 2025-12-04T11:11:09.6417377Z * [new branch] gh/zpcore/22/orig -> origin/gh/zpcore/22/orig 2025-12-04T11:11:09.6417448Z * [new branch] gh/zpcore/23/base -> origin/gh/zpcore/23/base 2025-12-04T11:11:09.6417517Z * [new branch] gh/zpcore/23/head -> origin/gh/zpcore/23/head 2025-12-04T11:11:09.6417592Z * [new branch] gh/zpcore/23/orig -> origin/gh/zpcore/23/orig 2025-12-04T11:11:09.6417663Z * [new branch] gh/zpcore/24/base -> origin/gh/zpcore/24/base 2025-12-04T11:11:09.6417733Z * [new branch] gh/zpcore/24/head -> origin/gh/zpcore/24/head 2025-12-04T11:11:09.6417809Z * [new branch] gh/zpcore/24/orig -> origin/gh/zpcore/24/orig 2025-12-04T11:11:09.6417879Z * [new branch] gh/zpcore/25/base -> origin/gh/zpcore/25/base 2025-12-04T11:11:09.6417950Z * [new branch] gh/zpcore/25/head -> origin/gh/zpcore/25/head 2025-12-04T11:11:09.6418025Z * [new branch] gh/zpcore/25/orig -> origin/gh/zpcore/25/orig 2025-12-04T11:11:09.6418095Z * [new branch] gh/zpcore/26/base -> origin/gh/zpcore/26/base 2025-12-04T11:11:09.6418210Z * [new branch] gh/zpcore/26/head -> origin/gh/zpcore/26/head 2025-12-04T11:11:09.6418291Z * [new branch] gh/zpcore/26/orig -> origin/gh/zpcore/26/orig 2025-12-04T11:11:09.6418362Z * [new branch] gh/zpcore/27/base -> origin/gh/zpcore/27/base 2025-12-04T11:11:09.6418439Z * [new branch] gh/zpcore/27/head -> origin/gh/zpcore/27/head 2025-12-04T11:11:09.6418513Z * [new branch] gh/zpcore/27/orig -> origin/gh/zpcore/27/orig 2025-12-04T11:11:09.6418583Z * [new branch] gh/zpcore/28/base -> origin/gh/zpcore/28/base 
2025-12-04T11:11:09.6418658Z * [new branch] gh/zpcore/28/head -> origin/gh/zpcore/28/head 2025-12-04T11:11:09.6418726Z * [new branch] gh/zpcore/28/orig -> origin/gh/zpcore/28/orig 2025-12-04T11:11:09.6418794Z * [new branch] gh/zpcore/3/base -> origin/gh/zpcore/3/base 2025-12-04T11:11:09.6418871Z * [new branch] gh/zpcore/3/head -> origin/gh/zpcore/3/head 2025-12-04T11:11:09.6418942Z * [new branch] gh/zpcore/4/base -> origin/gh/zpcore/4/base 2025-12-04T11:11:09.6419013Z * [new branch] gh/zpcore/4/head -> origin/gh/zpcore/4/head 2025-12-04T11:11:09.6419090Z * [new branch] gh/zpcore/5/base -> origin/gh/zpcore/5/base 2025-12-04T11:11:09.6419160Z * [new branch] gh/zpcore/5/head -> origin/gh/zpcore/5/head 2025-12-04T11:11:09.6419230Z * [new branch] gh/zpcore/6/base -> origin/gh/zpcore/6/base 2025-12-04T11:11:09.6419306Z * [new branch] gh/zpcore/6/head -> origin/gh/zpcore/6/head 2025-12-04T11:11:09.6419375Z * [new branch] gh/zpcore/7/base -> origin/gh/zpcore/7/base 2025-12-04T11:11:09.6419443Z * [new branch] gh/zpcore/7/head -> origin/gh/zpcore/7/head 2025-12-04T11:11:09.6419517Z * [new branch] gh/zpcore/8/base -> origin/gh/zpcore/8/base 2025-12-04T11:11:09.6419586Z * [new branch] gh/zpcore/8/head -> origin/gh/zpcore/8/head 2025-12-04T11:11:09.6419686Z * [new branch] google-main -> origin/google-main 2025-12-04T11:11:09.6419781Z * [new branch] guangyey/external_stream -> origin/guangyey/external_stream 2025-12-04T11:11:09.6419858Z * [new branch] guangyey/test_2025 -> origin/guangyey/test_2025 2025-12-04T11:11:09.6420036Z * [new branch] guilhermeleobas/cherry-pick-55d87d9dfd9 -> origin/guilhermeleobas/cherry-pick-55d87d9dfd9 2025-12-04T11:11:09.6420163Z * [new branch] hameerabbasi/complex_tensor_subclass -> origin/hameerabbasi/complex_tensor_subclass 2025-12-04T11:11:09.6420307Z * [new branch] hameerabbasi/fix-ctensor-gradcheck-tests -> origin/hameerabbasi/fix-ctensor-gradcheck-tests 2025-12-04T11:11:09.6420423Z * [new branch] hameerabbasi/gradcheck-allclose -> origin/hameerabbasi/gradcheck-allclose 2025-12-04T11:11:09.6420491Z * [new branch] hc_baseline -> origin/hc_baseline 2025-12-04T11:11:09.6420560Z * [new branch] hhh_rand -> origin/hhh_rand 2025-12-04T11:11:09.6420628Z * [new branch] huba/f1 -> origin/huba/f1 2025-12-04T11:11:09.6420821Z * [new branch] increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test -> origin/increase-timeout-linux-jammy-cuda12_8-py3_10-gcc11-test 2025-12-04T11:11:09.6420886Z * [new branch] inlining -> origin/inlining 2025-12-04T11:11:09.6420961Z * [new branch] inlining-ezyang -> origin/inlining-ezyang 2025-12-04T11:11:09.6421049Z * [new branch] install-torchao-0.13.0 -> origin/install-torchao-0.13.0 2025-12-04T11:11:09.6421234Z * [new branch] instrument-trunk-pull-linux-with-job-test-filters -> origin/instrument-trunk-pull-linux-with-job-test-filters 2025-12-04T11:11:09.6421313Z * [new branch] invoke-subgraph -> origin/invoke-subgraph 2025-12-04T11:11:09.6421392Z * [new branch] issue#58739 -> origin/issue#58739 2025-12-04T11:11:09.6421477Z * [new branch] jainapurva-patch-1 -> origin/jainapurva-patch-1 2025-12-04T11:11:09.6421548Z * [new branch] jathu/o3 -> origin/jathu/o3 2025-12-04T11:11:09.6421614Z * [new branch] jathu/sve -> origin/jathu/sve 2025-12-04T11:11:09.6421741Z * [new branch] jcaip/test-cusparselt-version-0.6.2 -> origin/jcaip/test-cusparselt-version-0.6.2 2025-12-04T11:11:09.6421852Z * [new branch] jcaip/update-cusparselt-0.6.2 -> origin/jcaip/update-cusparselt-0.6.2 2025-12-04T11:11:09.6421969Z * [new branch] jiannanWang/memorysnapshot_filter -> 
origin/jiannanWang/memorysnapshot_filter 2025-12-04T11:11:09.6422087Z * [new branch] jiannanWang/profilerstepwarning -> origin/jiannanWang/profilerstepwarning 2025-12-04T11:11:09.6422177Z * [new branch] jithunnair-amd-patch-1 -> origin/jithunnair-amd-patch-1 2025-12-04T11:11:09.6422268Z * [new branch] jithunnair-amd-patch-10 -> origin/jithunnair-amd-patch-10 2025-12-04T11:11:09.6422358Z * [new branch] jithunnair-amd-patch-2 -> origin/jithunnair-amd-patch-2 2025-12-04T11:11:09.6422442Z * [new branch] jithunnair-amd-patch-3 -> origin/jithunnair-amd-patch-3 2025-12-04T11:11:09.6422525Z * [new branch] jithunnair-amd-patch-4 -> origin/jithunnair-amd-patch-4 2025-12-04T11:11:09.6422613Z * [new branch] jithunnair-amd-patch-5 -> origin/jithunnair-amd-patch-5 2025-12-04T11:11:09.6422696Z * [new branch] jithunnair-amd-patch-6 -> origin/jithunnair-amd-patch-6 2025-12-04T11:11:09.6422778Z * [new branch] jithunnair-amd-patch-7 -> origin/jithunnair-amd-patch-7 2025-12-04T11:11:09.6422865Z * [new branch] jithunnair-amd-patch-8 -> origin/jithunnair-amd-patch-8 2025-12-04T11:11:09.6422947Z * [new branch] jithunnair-amd-patch-9 -> origin/jithunnair-amd-patch-9 2025-12-04T11:11:09.6423049Z * [new branch] justinchu/native-qdq -> origin/justinchu/native-qdq 2025-12-04T11:11:09.6423131Z * [new branch] kainan666/xlf_debug -> origin/kainan666/xlf_debug 2025-12-04T11:11:09.6423219Z * [new branch] kainan_test -> origin/kainan_test 2025-12-04T11:11:09.6423302Z * [new branch] larryliu0820-patch-1 -> origin/larryliu0820-patch-1 2025-12-04T11:11:09.6423415Z * [new branch] leslie/test_group_gemm_epilogues -> origin/leslie/test_group_gemm_epilogues 2025-12-04T11:11:09.6423522Z * [new branch] lessw2020/fix_cutlass_cache_error -> origin/lessw2020/fix_cutlass_cache_error 2025-12-04T11:11:09.6423610Z * [new branch] liaoxuan/shm_all_reduce -> origin/liaoxuan/shm_all_reduce 2025-12-04T11:11:09.6423714Z * [new branch] liaoxuan/test_fa_disable_softmax -> origin/liaoxuan/test_fa_disable_softmax 2025-12-04T11:11:09.6423799Z * [new branch] liaoxuan/test_int8_sdpa -> origin/liaoxuan/test_int8_sdpa 2025-12-04T11:11:09.6423872Z * [new branch] llama4-stable -> origin/llama4-stable 2025-12-04T11:11:09.6423938Z * [new branch] lts/release/1.8 -> origin/lts/release/1.8 2025-12-04T11:11:09.6424018Z * [new branch] lucaskabela/#94773 -> origin/lucaskabela/#94773 2025-12-04T11:11:09.6424099Z * [new branch] lucaskabela/fix_164876 -> origin/lucaskabela/fix_164876 2025-12-04T11:11:09.6424190Z * [new branch] lucaskabela/flop_counter -> origin/lucaskabela/flop_counter 2025-12-04T11:11:09.6424290Z * [new branch] lucaskabela/func_under_decomp -> origin/lucaskabela/func_under_decomp 2025-12-04T11:11:09.6424398Z * [new branch] lucaskabela/functional_in_dynamo -> origin/lucaskabela/functional_in_dynamo 2025-12-04T11:11:09.6424530Z * [new branch] lucaskabela/install_params_as_graph_attr -> origin/lucaskabela/install_params_as_graph_attr 2025-12-04T11:11:09.6424645Z * [new branch] lucaskabela/parameters_as_graph_attr -> origin/lucaskabela/parameters_as_graph_attr 2025-12-04T11:11:09.6424778Z * [new branch] lucaskabela/remove_aot_dispatcher_metadata -> origin/lucaskabela/remove_aot_dispatcher_metadata 2025-12-04T11:11:09.6424864Z * [new branch] lucaskabela/rnn_decomp -> origin/lucaskabela/rnn_decomp 2025-12-04T11:11:09.6424959Z * [new branch] lucaskabela/typing_backends -> origin/lucaskabela/typing_backends 2025-12-04T11:11:09.6425059Z * [new branch] lucaskabela/typing_ctx_manager -> origin/lucaskabela/typing_ctx_manager 2025-12-04T11:11:09.6425160Z * [new 
branch] lucaskabela/typing_nn_module -> origin/lucaskabela/typing_nn_module 2025-12-04T11:11:09.6425263Z * [new branch] lucaskabela/typing_user_defined -> origin/lucaskabela/typing_user_defined 2025-12-04T11:11:09.6425362Z * [new branch] lucaskabela/typing_variables -> origin/lucaskabela/typing_variables 2025-12-04T11:11:09.6425475Z * [new branch] lucaskabela/typing_variables_dicts -> origin/lucaskabela/typing_variables_dicts 2025-12-04T11:11:09.6425600Z * [new branch] lucaskabela/typing_variables_functions -> origin/lucaskabela/typing_variables_functions 2025-12-04T11:11:09.6425712Z * [new branch] lucaskabela/typing_variables_lists -> origin/lucaskabela/typing_variables_lists 2025-12-04T11:11:09.6425789Z * [new branch] lw/torch_box_by_ref -> origin/lw/torch_box_by_ref 2025-12-04T11:11:09.6425855Z * [new branch] main -> origin/main 2025-12-04T11:11:09.6425932Z * [new branch] malfet-patch-1 -> origin/malfet-patch-1 2025-12-04T11:11:09.6426005Z * [new branch] malfet-patch-2 -> origin/malfet-patch-2 2025-12-04T11:11:09.6426074Z * [new branch] malfet-patch-3 -> origin/malfet-patch-3 2025-12-04T11:11:09.6426170Z * [new branch] malfet-patch-4 -> origin/malfet-patch-4 2025-12-04T11:11:09.6426239Z * [new branch] malfet-patch-5 -> origin/malfet-patch-5 2025-12-04T11:11:09.6426328Z * [new branch] malfet-patch-6 -> origin/malfet-patch-6 2025-12-04T11:11:09.6426399Z * [new branch] malfet-patch-7 -> origin/malfet-patch-7 2025-12-04T11:11:09.6426466Z * [new branch] malfet-patch-8 -> origin/malfet-patch-8 2025-12-04T11:11:09.6426542Z * [new branch] malfet/add-3.14-ci -> origin/malfet/add-3.14-ci 2025-12-04T11:11:09.6426710Z * [new branch] malfet/be-do-not-make-typos-in-build-artifacts -> origin/malfet/be-do-not-make-typos-in-build-artifacts 2025-12-04T11:11:09.6426882Z * [new branch] malfet/be-move-more-settings-to-checkout-pytorch -> origin/malfet/be-move-more-settings-to-checkout-pytorch 2025-12-04T11:11:09.6427013Z * [new branch] malfet/be-remove-misisng-neon-headers -> origin/malfet/be-remove-misisng-neon-headers 2025-12-04T11:11:09.6427118Z * [new branch] malfet/mps-implement-col2im -> origin/malfet/mps-implement-col2im 2025-12-04T11:11:09.6427239Z * [new branch] manuel/aoti_metal_shimify-thread_safe -> origin/manuel/aoti_metal_shimify-thread_safe 2025-12-04T11:11:09.6427338Z * [new branch] manuel/inductor_link_openmp -> origin/manuel/inductor_link_openmp 2025-12-04T11:11:09.6427417Z * [new branch] masnesral/metaconda -> origin/masnesral/metaconda 2025-12-04T11:11:09.6427496Z * [new branch] mem_profiler_flaky_fix -> origin/mem_profiler_flaky_fix 2025-12-04T11:11:09.6427583Z * [new branch] mem_profiler_stack_trace -> origin/mem_profiler_stack_trace 2025-12-04T11:11:09.6427661Z * [new branch] memory_profiler_stack -> origin/memory_profiler_stack 2025-12-04T11:11:09.6427739Z * [new branch] metascroy-patch-1 -> origin/metascroy-patch-1 2025-12-04T11:11:09.6427812Z * [new branch] mingw_posix -> origin/mingw_posix 2025-12-04T11:11:09.6427891Z * [new branch] mlazos/S429861-debug -> origin/mlazos/S429861-debug 2025-12-04T11:11:09.6427958Z * [new branch] mlazos/aa -> origin/mlazos/aa 2025-12-04T11:11:09.6428027Z * [new branch] mlazos/acts -> origin/mlazos/acts 2025-12-04T11:11:09.6428102Z * [new branch] mlazos/arg-renames -> origin/mlazos/arg-renames 2025-12-04T11:11:09.6428223Z * [new branch] mlazos/bad-cudagraphs -> origin/mlazos/bad-cudagraphs 2025-12-04T11:11:09.6428332Z * [new branch] mlazos/baseline-graph-breaks -> origin/mlazos/baseline-graph-breaks 2025-12-04T11:11:09.6428408Z * [new branch] 
mlazos/beta-tensor -> origin/mlazos/beta-tensor 2025-12-04T11:11:09.6428477Z * [new branch] mlazos/buffers -> origin/mlazos/buffers 2025-12-04T11:11:09.6428550Z * [new branch] mlazos/buffers2 -> origin/mlazos/buffers2 2025-12-04T11:11:09.6428620Z * [new branch] mlazos/buffers3 -> origin/mlazos/buffers3 2025-12-04T11:11:09.6428686Z * [new branch] mlazos/bwd -> origin/mlazos/bwd 2025-12-04T11:11:09.6428763Z * [new branch] mlazos/combo-test -> origin/mlazos/combo-test 2025-12-04T11:11:09.6428837Z * [new branch] mlazos/ctx-cleanup -> origin/mlazos/ctx-cleanup 2025-12-04T11:11:09.6428917Z * [new branch] mlazos/cuda-cmd-log -> origin/mlazos/cuda-cmd-log 2025-12-04T11:11:09.6429001Z * [new branch] mlazos/cudagraph-tests -> origin/mlazos/cudagraph-tests 2025-12-04T11:11:09.6429106Z * [new branch] mlazos/cudagraphs-measurement -> origin/mlazos/cudagraphs-measurement 2025-12-04T11:11:09.6429216Z * [new branch] mlazos/cutlass-test -> origin/mlazos/cutlass-test 2025-12-04T11:11:09.6429301Z * [new branch] mlazos/cutlass-topo-bug -> origin/mlazos/cutlass-topo-bug 2025-12-04T11:11:09.6429384Z * [new branch] mlazos/dataclass-proxy -> origin/mlazos/dataclass-proxy 2025-12-04T11:11:09.6429485Z * [new branch] mlazos/dc-attrs -> origin/mlazos/dc-attrs 2025-12-04T11:11:09.6429556Z * [new branch] mlazos/dc-helion -> origin/mlazos/dc-helion 2025-12-04T11:11:09.6429625Z * [new branch] mlazos/dict-fix -> origin/mlazos/dict-fix 2025-12-04T11:11:09.6429702Z * [new branch] mlazos/disable-tf -> origin/mlazos/disable-tf 2025-12-04T11:11:09.6429772Z * [new branch] mlazos/dupe-fix -> origin/mlazos/dupe-fix 2025-12-04T11:11:09.6429842Z * [new branch] mlazos/dyn-batch -> origin/mlazos/dyn-batch 2025-12-04T11:11:09.6429916Z * [new branch] mlazos/evt -> origin/mlazos/evt 2025-12-04T11:11:09.6430001Z * [new branch] mlazos/extract-examples -> origin/mlazos/extract-examples 2025-12-04T11:11:09.6430074Z * [new branch] mlazos/foreach-op -> origin/mlazos/foreach-op 2025-12-04T11:11:09.6430143Z * [new branch] mlazos/fp8 -> origin/mlazos/fp8 2025-12-04T11:11:09.6430212Z * [new branch] mlazos/fp8-bias -> origin/mlazos/fp8-bias 2025-12-04T11:11:09.6430293Z * [new branch] mlazos/fp8-bias-fusion -> origin/mlazos/fp8-bias-fusion 2025-12-04T11:11:09.6430369Z * [new branch] mlazos/fp8-fixes -> origin/mlazos/fp8-fixes 2025-12-04T11:11:09.6430437Z * [new branch] mlazos/freezing -> origin/mlazos/freezing 2025-12-04T11:11:09.6430506Z * [new branch] mlazos/h-comp -> origin/mlazos/h-comp 2025-12-04T11:11:09.6430581Z * [new branch] mlazos/h-comp2 -> origin/mlazos/h-comp2 2025-12-04T11:11:09.6430650Z * [new branch] mlazos/hash-hop -> origin/mlazos/hash-hop 2025-12-04T11:11:09.6430716Z * [new branch] mlazos/hc -> origin/mlazos/hc 2025-12-04T11:11:09.6430789Z * [new branch] mlazos/hc-cycles -> origin/mlazos/hc-cycles 2025-12-04T11:11:09.6430857Z * [new branch] mlazos/hc-fixes -> origin/mlazos/hc-fixes 2025-12-04T11:11:09.6430930Z * [new branch] mlazos/hc-fixes3 -> origin/mlazos/hc-fixes3 2025-12-04T11:11:09.6431000Z * [new branch] mlazos/hc-fixes4 -> origin/mlazos/hc-fixes4 2025-12-04T11:11:09.6431067Z * [new branch] mlazos/hc-hf -> origin/mlazos/hc-hf 2025-12-04T11:11:09.6431137Z * [new branch] mlazos/hc-mut -> origin/mlazos/hc-mut 2025-12-04T11:11:09.6431202Z * [new branch] mlazos/hc10 -> origin/mlazos/hc10 2025-12-04T11:11:09.6431268Z * [new branch] mlazos/hc11 -> origin/mlazos/hc11 2025-12-04T11:11:09.6431334Z * [new branch] mlazos/hc12 -> origin/mlazos/hc12 2025-12-04T11:11:09.6431398Z * [new branch] mlazos/hc13 -> origin/mlazos/hc13 
2025-12-04T11:11:09.6431463Z * [new branch] mlazos/hc14 -> origin/mlazos/hc14 2025-12-04T11:11:09.6431529Z * [new branch] mlazos/hc15 -> origin/mlazos/hc15 2025-12-04T11:11:09.6431594Z * [new branch] mlazos/hc2 -> origin/mlazos/hc2 2025-12-04T11:11:09.6431658Z * [new branch] mlazos/hc4 -> origin/mlazos/hc4 2025-12-04T11:11:09.6431724Z * [new branch] mlazos/hc5 -> origin/mlazos/hc5 2025-12-04T11:11:09.6431788Z * [new branch] mlazos/hc6 -> origin/mlazos/hc6 2025-12-04T11:11:09.6431850Z * [new branch] mlazos/hc7 -> origin/mlazos/hc7 2025-12-04T11:11:09.6431939Z * [new branch] mlazos/hc8 -> origin/mlazos/hc8 2025-12-04T11:11:09.6432001Z * [new branch] mlazos/hc9 -> origin/mlazos/hc9 2025-12-04T11:11:09.6432105Z * [new branch] mlazos/hc_baseline2 -> origin/mlazos/hc_baseline2 2025-12-04T11:11:09.6432195Z * [new branch] mlazos/inductor-streams -> origin/mlazos/inductor-streams 2025-12-04T11:11:09.6432259Z * [new branch] mlazos/main -> origin/mlazos/main 2025-12-04T11:11:09.6432323Z * [new branch] mlazos/mcg2 -> origin/mlazos/mcg2 2025-12-04T11:11:09.6432402Z * [new branch] mlazos/meta-guards -> origin/mlazos/meta-guards 2025-12-04T11:11:09.6432508Z * [new branch] mlazos/mlazos/foreach-map-adam -> origin/mlazos/mlazos/foreach-map-adam 2025-12-04T11:11:09.6432607Z * [new branch] mlazos/mlazos/tf-mode-backup -> origin/mlazos/mlazos/tf-mode-backup 2025-12-04T11:11:09.6432686Z * [new branch] mlazos/mod-fix -> origin/mlazos/mod-fix 2025-12-04T11:11:09.6432757Z * [new branch] mlazos/mode-fix -> origin/mlazos/mode-fix 2025-12-04T11:11:09.6432828Z * [new branch] mlazos/offsets -> origin/mlazos/offsets 2025-12-04T11:11:09.6432904Z * [new branch] mlazos/overguarding -> origin/mlazos/overguarding 2025-12-04T11:11:09.6432979Z * [new branch] mlazos/proxy-ctors -> origin/mlazos/proxy-ctors 2025-12-04T11:11:09.6433053Z * [new branch] mlazos/quant-fix -> origin/mlazos/quant-fix 2025-12-04T11:11:09.6433125Z * [new branch] mlazos/resnet-fix -> origin/mlazos/resnet-fix 2025-12-04T11:11:09.6433201Z * [new branch] mlazos/rm-buf-names -> origin/mlazos/rm-buf-names 2025-12-04T11:11:09.6433272Z * [new branch] mlazos/rm-code -> origin/mlazos/rm-code 2025-12-04T11:11:09.6433343Z * [new branch] mlazos/rm-spam -> origin/mlazos/rm-spam 2025-12-04T11:11:09.6433409Z * [new branch] mlazos/rtp -> origin/mlazos/rtp 2025-12-04T11:11:09.6433494Z * [new branch] mlazos/static-idx-dbg -> origin/mlazos/static-idx-dbg 2025-12-04T11:11:09.6433582Z * [new branch] mlazos/static-inputs-log -> origin/mlazos/static-inputs-log 2025-12-04T11:11:09.6433649Z * [new branch] mlazos/stests -> origin/mlazos/stests 2025-12-04T11:11:09.6433726Z * [new branch] mlazos/stream-ops -> origin/mlazos/stream-ops 2025-12-04T11:11:09.6433794Z * [new branch] mlazos/td-fix2 -> origin/mlazos/td-fix2 2025-12-04T11:11:09.6433876Z * [new branch] mlazos/tensor-hasattr2 -> origin/mlazos/tensor-hasattr2 2025-12-04T11:11:09.6433944Z * [new branch] mlazos/test -> origin/mlazos/test 2025-12-04T11:11:09.6434013Z * [new branch] mlazos/tf-mode -> origin/mlazos/tf-mode 2025-12-04T11:11:09.6434094Z * [new branch] mlazos/tf-mode-backup2 -> origin/mlazos/tf-mode-backup2 2025-12-04T11:11:09.6434178Z * [new branch] mlazos/tf-mode-reland -> origin/mlazos/tf-mode-reland 2025-12-04T11:11:09.6434258Z * [new branch] mlazos/tf-mode-reland2 -> origin/mlazos/tf-mode-reland2 2025-12-04T11:11:09.6434338Z * [new branch] mlazos/tf-mode-reland3 -> origin/mlazos/tf-mode-reland3 2025-12-04T11:11:09.6434421Z * [new branch] mlazos/triton-no-epi -> origin/mlazos/triton-no-epi 
2025-12-04T11:11:09.6434495Z * [new branch] mlazos/tune-proto -> origin/mlazos/tune-proto 2025-12-04T11:11:09.6434574Z * [new branch] mlazos/tuple-fixes -> origin/mlazos/tuple-fixes 2025-12-04T11:11:09.6434652Z * [new branch] mlazos/tuple-fixes2 -> origin/mlazos/tuple-fixes2 2025-12-04T11:11:09.6434752Z * [new branch] mlazos/tuple-handling -> origin/mlazos/tuple-handling 2025-12-04T11:11:09.6434840Z * [new branch] mlazos/user-stream-base -> origin/mlazos/user-stream-base 2025-12-04T11:11:09.6434938Z * [new branch] mlazos/user-streams -> origin/mlazos/user-streams 2025-12-04T11:11:09.6435032Z * [new branch] mlazos/user-streams-backup -> origin/mlazos/user-streams-backup 2025-12-04T11:11:09.6435133Z * [new branch] mlazos/user-streams-backup2 -> origin/mlazos/user-streams-backup2 2025-12-04T11:11:09.6435205Z * [new branch] mlazos/vary-beta -> origin/mlazos/vary-beta 2025-12-04T11:11:09.6435277Z * [new branch] mlazos/vary-beta2 -> origin/mlazos/vary-beta2 2025-12-04T11:11:09.6435356Z * [new branch] mlazos/weird-perf1 -> origin/mlazos/weird-perf1 2025-12-04T11:11:09.6435431Z * [new branch] mm_out_dtype_compile -> origin/mm_out_dtype_compile 2025-12-04T11:11:09.6435499Z * [new branch] module-shim -> origin/module-shim 2025-12-04T11:11:09.6435567Z * [new branch] move_config -> origin/move_config 2025-12-04T11:11:09.6435641Z * [new branch] msaroufim/reduce -> origin/msaroufim/reduce 2025-12-04T11:11:09.6435713Z * [new branch] mtia/basic-cmake -> origin/mtia/basic-cmake 2025-12-04T11:11:09.6435823Z * [new branch] mwizak/fix-triton-block-shape -> origin/mwizak/fix-triton-block-shape 2025-12-04T11:11:09.6435892Z * [new branch] my_varlen_backup -> origin/my_varlen_backup 2025-12-04T11:11:09.6435969Z * [new branch] nativert_num_outputs -> origin/nativert_num_outputs 2025-12-04T11:11:09.6436041Z * [new branch] new-codegen -> origin/new-codegen 2025-12-04T11:11:09.6436110Z * [new branch] newtest-base -> origin/newtest-base 2025-12-04T11:11:09.6436185Z * [new branch] ngimel/addmm_dtype -> origin/ngimel/addmm_dtype 2025-12-04T11:11:09.6436258Z * [new branch] ngimel/div_inv -> origin/ngimel/div_inv 2025-12-04T11:11:09.6436340Z * [new branch] ngimel/error_index_list -> origin/ngimel/error_index_list 2025-12-04T11:11:09.6436415Z * [new branch] ngimel/gather_grid -> origin/ngimel/gather_grid 2025-12-04T11:11:09.6436507Z * [new branch] ngimel/gather_grid_release -> origin/ngimel/gather_grid_release 2025-12-04T11:11:09.6436574Z * [new branch] ngimel/gg_new -> origin/ngimel/gg_new 2025-12-04T11:11:09.6436648Z * [new branch] ngimel/hostalloc -> origin/ngimel/hostalloc 2025-12-04T11:11:09.6436720Z * [new branch] ngimel/storage_id -> origin/ngimel/storage_id 2025-12-04T11:11:09.6436785Z * [new branch] nightly -> origin/nightly 2025-12-04T11:11:09.6436910Z * [new branch] nikitaved/addmm_1_rowcol_lt_path_check -> origin/nikitaved/addmm_1_rowcol_lt_path_check 2025-12-04T11:11:09.6437036Z * [new branch] nikitaved/addmm_epilogue_fusions_2d_bias -> origin/nikitaved/addmm_epilogue_fusions_2d_bias 2025-12-04T11:11:09.6437167Z * [new branch] nikitaved/addmm_epilogue_fusions_inductor -> origin/nikitaved/addmm_epilogue_fusions_inductor 2025-12-04T11:11:09.6437296Z * [new branch] nikitaved/addmm_epilogue_fusions_scratch -> origin/nikitaved/addmm_epilogue_fusions_scratch 2025-12-04T11:11:09.6437416Z * [new branch] nikitaved/grad_addmm_epilogue_fusions -> origin/nikitaved/grad_addmm_epilogue_fusions 2025-12-04T11:11:09.6437531Z * [new branch] nikitaved/simpler_can_use_32bit_index -> origin/nikitaved/simpler_can_use_32bit_index 
2025-12-04T11:11:09.6437605Z * [new branch] nikitaved/test -> origin/nikitaved/test 2025-12-04T11:11:09.6437756Z * [new branch] nmacchioni-perf-test-async-autotune -> origin/nmacchioni-perf-test-async-autotune 2025-12-04T11:11:09.6437838Z * [new branch] no_distributed_log_spew -> origin/no_distributed_log_spew 2025-12-04T11:11:09.6437926Z * [new branch] nofun-hack -> origin/nofun-hack 2025-12-04T11:11:09.6437989Z * [new branch] norm_bench -> origin/norm_bench 2025-12-04T11:11:09.6438071Z * [new branch] nullplay/fuse_matmul -> origin/nullplay/fuse_matmul 2025-12-04T11:11:09.6438184Z * [new branch] nullplay_fuse_matmul -> origin/nullplay_fuse_matmul 2025-12-04T11:11:09.6438254Z * [new branch] optimizer_test -> origin/optimizer_test 2025-12-04T11:11:09.6438330Z * [new branch] orig/release/1.10 -> origin/orig/release/1.10 2025-12-04T11:11:09.6438403Z * [new branch] orig/release/1.11 -> origin/orig/release/1.11 2025-12-04T11:11:09.6438474Z * [new branch] orig/release/1.12 -> origin/orig/release/1.12 2025-12-04T11:11:09.6438547Z * [new branch] orig/release/1.13 -> origin/orig/release/1.13 2025-12-04T11:11:09.6438614Z * [new branch] orig/release/1.6 -> origin/orig/release/1.6 2025-12-04T11:11:09.6438685Z * [new branch] orig/release/1.7 -> origin/orig/release/1.7 2025-12-04T11:11:09.6438752Z * [new branch] orig/release/1.8 -> origin/orig/release/1.8 2025-12-04T11:11:09.6438818Z * [new branch] orig/release/1.9 -> origin/orig/release/1.9 2025-12-04T11:11:09.6438885Z * [new branch] orig/release/2.0 -> origin/orig/release/2.0 2025-12-04T11:11:09.6438950Z * [new branch] orig/release/2.1 -> origin/orig/release/2.1 2025-12-04T11:11:09.6439017Z * [new branch] orig/release/2.2 -> origin/orig/release/2.2 2025-12-04T11:11:09.6439087Z * [new branch] orig/release/2.3 -> origin/orig/release/2.3 2025-12-04T11:11:09.6439153Z * [new branch] orig/release/2.4 -> origin/orig/release/2.4 2025-12-04T11:11:09.6439218Z * [new branch] orig/release/2.5 -> origin/orig/release/2.5 2025-12-04T11:11:09.6439287Z * [new branch] orig/release/2.6 -> origin/orig/release/2.6 2025-12-04T11:11:09.6439352Z * [new branch] orig/release/2.7 -> origin/orig/release/2.7 2025-12-04T11:11:09.6439417Z * [new branch] orig/release/2.8 -> origin/orig/release/2.8 2025-12-04T11:11:09.6439485Z * [new branch] orig/release/2.9 -> origin/orig/release/2.9 2025-12-04T11:11:09.6439573Z * [new branch] origin/gh/fxdawnn/1/base -> origin/origin/gh/fxdawnn/1/base 2025-12-04T11:11:09.6439657Z * [new branch] origin/gh/fxdawnn/1/orig -> origin/origin/gh/fxdawnn/1/orig 2025-12-04T11:11:09.6439742Z * [new branch] origin/gh/zpcore/14/orig -> origin/origin/gh/zpcore/14/orig 2025-12-04T11:11:09.6439813Z * [new branch] oulgen-patch-1 -> origin/oulgen-patch-1 2025-12-04T11:11:09.6439883Z * [new branch] oulgen-patch-2 -> origin/oulgen-patch-2 2025-12-04T11:11:09.6439953Z * [new branch] oulgen-patch-3 -> origin/oulgen-patch-3 2025-12-04T11:11:09.6440022Z * [new branch] oulgen-patch-4 -> origin/oulgen-patch-4 2025-12-04T11:11:09.6440094Z * [new branch] padded-tensor -> origin/padded-tensor 2025-12-04T11:11:09.6440160Z * [new branch] pca2 -> origin/pca2 2025-12-04T11:11:09.6440235Z * [new branch] per_channel_backup -> origin/per_channel_backup 2025-12-04T11:11:09.6440301Z * [new branch] perf_ops -> origin/perf_ops 2025-12-04T11:11:09.6440365Z * [new branch] perf_ops_2_9 -> origin/perf_ops_2_9 2025-12-04T11:11:09.6440471Z * [new branch] pianpwk-patch-1 -> origin/pianpwk-patch-1 2025-12-04T11:11:09.6440562Z * [new branch] pianpwk/__draft_debug_mode -> 
origin/pianpwk/__draft_debug_mode 2025-12-04T11:11:09.6440699Z * [new branch] pianpwk/_debug_mode_for_triton_draft -> origin/pianpwk/_debug_mode_for_triton_draft 2025-12-04T11:11:09.6440802Z * [new branch] pianpwk/_debug_nn_module_compile -> origin/pianpwk/_debug_nn_module_compile 2025-12-04T11:11:09.6440893Z * [new branch] pianpwk/_draft_triton_11_3 -> origin/pianpwk/_draft_triton_11_3 2025-12-04T11:11:09.6440987Z * [new branch] pianpwk/_manual_bucket_draft -> origin/pianpwk/_manual_bucket_draft 2025-12-04T11:11:09.6441091Z * [new branch] pianpwk/_profile_w_dispatch_keys -> origin/pianpwk/_profile_w_dispatch_keys 2025-12-04T11:11:09.6441191Z * [new branch] pianpwk/_super_draft_debug_mode -> origin/pianpwk/_super_draft_debug_mode 2025-12-04T11:11:09.6441299Z * [new branch] pianpwk/_unbacked_local_shard_size -> origin/pianpwk/_unbacked_local_shard_size 2025-12-04T11:11:09.6441375Z * [new branch] pianpwk/anomaly_tb -> origin/pianpwk/anomaly_tb 2025-12-04T11:11:09.6441463Z * [new branch] pianpwk/auto_fx_annotate -> origin/pianpwk/auto_fx_annotate 2025-12-04T11:11:09.6441578Z * [new branch] pianpwk/backed_size_oblivious_export -> origin/pianpwk/backed_size_oblivious_export 2025-12-04T11:11:09.6441668Z * [new branch] pianpwk/bert_dynamic_perf -> origin/pianpwk/bert_dynamic_perf 2025-12-04T11:11:09.6441766Z * [new branch] pianpwk/debug_fwd_stack_traces -> origin/pianpwk/debug_fwd_stack_traces 2025-12-04T11:11:09.6441852Z * [new branch] pianpwk/debug_hash_tensor -> origin/pianpwk/debug_hash_tensor 2025-12-04T11:11:09.6441944Z * [new branch] pianpwk/debug_mode_annotate -> origin/pianpwk/debug_mode_annotate 2025-12-04T11:11:09.6442035Z * [new branch] pianpwk/debug_mode_defaults -> origin/pianpwk/debug_mode_defaults 2025-12-04T11:11:09.6442119Z * [new branch] pianpwk/debug_mode_hacks -> origin/pianpwk/debug_mode_hacks 2025-12-04T11:11:09.6442230Z * [new branch] pianpwk/debug_mode_opcall_refactor -> origin/pianpwk/debug_mode_opcall_refactor 2025-12-04T11:11:09.6442321Z * [new branch] pianpwk/debug_mode_show_ids -> origin/pianpwk/debug_mode_show_ids 2025-12-04T11:11:09.6442405Z * [new branch] pianpwk/debug_mode_triton -> origin/pianpwk/debug_mode_triton 2025-12-04T11:11:09.6442504Z * [new branch] pianpwk/debug_show_stack_trace -> origin/pianpwk/debug_show_stack_trace 2025-12-04T11:11:09.6442605Z * [new branch] pianpwk/debug_wait_on_collective -> origin/pianpwk/debug_wait_on_collective 2025-12-04T11:11:09.6442704Z * [new branch] pianpwk/debugmode_compile_tf -> origin/pianpwk/debugmode_compile_tf 2025-12-04T11:11:09.6442833Z * [new branch] pianpwk/dispatch_key_debugging_for_debug -> origin/pianpwk/dispatch_key_debugging_for_debug 2025-12-04T11:11:09.6442940Z * [new branch] pianpwk/draft_debug_mode_tfcompile -> origin/pianpwk/draft_debug_mode_tfcompile 2025-12-04T11:11:09.6443040Z * [new branch] pianpwk/draft_multikernel_nn -> origin/pianpwk/draft_multikernel_nn 2025-12-04T11:11:09.6443156Z * [new branch] pianpwk/draft_multikernel_status_10_5 -> origin/pianpwk/draft_multikernel_status_10_5 2025-12-04T11:11:09.6443251Z * [new branch] pianpwk/dtensor_custom_chunk -> origin/pianpwk/dtensor_custom_chunk 2025-12-04T11:11:09.6443357Z * [new branch] pianpwk/dtensor_unbacked_keypath -> origin/pianpwk/dtensor_unbacked_keypath 2025-12-04T11:11:09.6443438Z * [new branch] pianpwk/event_list_tree -> origin/pianpwk/event_list_tree 2025-12-04T11:11:09.6443550Z * [new branch] pianpwk/false_numel_refs -> origin/pianpwk/false_numel_refs 2025-12-04T11:11:09.6443632Z * [new branch] pianpwk/maybe_guard_rel -> 
origin/pianpwk/maybe_guard_rel 2025-12-04T11:11:09.6443737Z * [new branch] pianpwk/multikernel_hints_draft -> origin/pianpwk/multikernel_hints_draft 2025-12-04T11:11:09.6443874Z * [new branch] pianpwk/no_size_oblivious_slice_scat -> origin/pianpwk/no_size_oblivious_slice_scat 2025-12-04T11:11:09.6443992Z * [new branch] pianpwk/oblivious_reshape_view_better -> origin/pianpwk/oblivious_reshape_view_better 2025-12-04T11:11:09.6444076Z * [new branch] pianpwk/pre_forward_hook -> origin/pianpwk/pre_forward_hook 2025-12-04T11:11:09.6444185Z * [new branch] pianpwk/skip_python_keys_alternate -> origin/pianpwk/skip_python_keys_alternate 2025-12-04T11:11:09.6444293Z * [new branch] pianpwk/skip_python_keys_in_guards -> origin/pianpwk/skip_python_keys_in_guards 2025-12-04T11:11:09.6444377Z * [new branch] pianpwk/sym_tokens_draft -> origin/pianpwk/sym_tokens_draft 2025-12-04T11:11:09.6444460Z * [new branch] pianpwk/symint_one_hot -> origin/pianpwk/symint_one_hot 2025-12-04T11:11:09.6444576Z * [new branch] pianpwk/test_pointwise_guard_or_false -> origin/pianpwk/test_pointwise_guard_or_false 2025-12-04T11:11:09.6444675Z * [new branch] pianpwk/totally_draft_sym_wrap -> origin/pianpwk/totally_draft_sym_wrap 2025-12-04T11:11:09.6444761Z * [new branch] pianpwk/try_dumb_stuff -> origin/pianpwk/try_dumb_stuff 2025-12-04T11:11:09.6444841Z * [new branch] pianpwk/try_dumb_stuff_2 -> origin/pianpwk/try_dumb_stuff_2 2025-12-04T11:11:09.6444933Z * [new branch] pianpwk/unbacked_dtensor_mm -> origin/pianpwk/unbacked_dtensor_mm 2025-12-04T11:11:09.6445031Z * [new branch] pianpwk/unbacked_tracing_12_2 -> origin/pianpwk/unbacked_tracing_12_2 2025-12-04T11:11:09.6445109Z * [new branch] pianpwk/user_symints -> origin/pianpwk/user_symints 2025-12-04T11:11:09.6445188Z * [new branch] pianpwk/wan21_reshape -> origin/pianpwk/wan21_reshape 2025-12-04T11:11:09.6445283Z * [new branch] piz/fix_partial_backward_1112 -> origin/piz/fix_partial_backward_1112 2025-12-04T11:11:09.6445361Z * [new branch] piz/prop_cache_clean -> origin/piz/prop_cache_clean 2025-12-04T11:11:09.6445430Z * [new branch] pool-separate -> origin/pool-separate 2025-12-04T11:11:09.6445496Z * [new branch] pr-156087 -> origin/pr-156087 2025-12-04T11:11:09.6445557Z * [new branch] pr/131860 -> origin/pr/131860 2025-12-04T11:11:09.6445627Z * [new branch] predispatch_to -> origin/predispatch_to 2025-12-04T11:11:09.6445696Z * [new branch] protect-c17 -> origin/protect-c17 2025-12-04T11:11:09.6445767Z * [new branch] pt-opt-cuda3 -> origin/pt-opt-cuda3 2025-12-04T11:11:09.6445851Z * [new branch] python_compiled_autograd -> origin/python_compiled_autograd 2025-12-04T11:11:09.6445985Z * [new branch] q1l1/fix_device_moved_constant_type_unknown -> origin/q1l1/fix_device_moved_constant_type_unknown 2025-12-04T11:11:09.6446127Z * [new branch] q1l1/fix_wrong_default_type_for_kernel_call_args -> origin/q1l1/fix_wrong_default_type_for_kernel_call_args 2025-12-04T11:11:09.6446210Z * [new branch] qchip/export-D54134695 -> origin/qchip/export-D54134695 2025-12-04T11:11:09.6446286Z * [new branch] quote-pytest_cache -> origin/quote-pytest_cache 2025-12-04T11:11:09.6446385Z * [new branch] reland-accgrad-stream-warn -> origin/reland-accgrad-stream-warn 2025-12-04T11:11:09.6446451Z * [new branch] release/1.10 -> origin/release/1.10 2025-12-04T11:11:09.6446544Z * [new branch] release/1.11 -> origin/release/1.11 2025-12-04T11:11:09.6446608Z * [new branch] release/1.12 -> origin/release/1.12 2025-12-04T11:11:09.6446674Z * [new branch] release/1.13 -> origin/release/1.13 
2025-12-04T11:11:09.6446759Z * [new branch] release/1.4 -> origin/release/1.4 2025-12-04T11:11:09.6446825Z * [new branch] release/1.4.1 -> origin/release/1.4.1 2025-12-04T11:11:09.6446888Z * [new branch] release/1.5 -> origin/release/1.5 2025-12-04T11:11:09.6446950Z * [new branch] release/1.6 -> origin/release/1.6 2025-12-04T11:11:09.6447012Z * [new branch] release/1.7 -> origin/release/1.7 2025-12-04T11:11:09.6447075Z * [new branch] release/1.8 -> origin/release/1.8 2025-12-04T11:11:09.6447136Z * [new branch] release/1.9 -> origin/release/1.9 2025-12-04T11:11:09.6447198Z * [new branch] release/2.0 -> origin/release/2.0 2025-12-04T11:11:09.6447260Z * [new branch] release/2.1 -> origin/release/2.1 2025-12-04T11:11:09.6447321Z * [new branch] release/2.2 -> origin/release/2.2 2025-12-04T11:11:09.6447383Z * [new branch] release/2.3 -> origin/release/2.3 2025-12-04T11:11:09.6447446Z * [new branch] release/2.4 -> origin/release/2.4 2025-12-04T11:11:09.6447507Z * [new branch] release/2.5 -> origin/release/2.5 2025-12-04T11:11:09.6447567Z * [new branch] release/2.6 -> origin/release/2.6 2025-12-04T11:11:09.6447629Z * [new branch] release/2.7 -> origin/release/2.7 2025-12-04T11:11:09.6447690Z * [new branch] release/2.8 -> origin/release/2.8 2025-12-04T11:11:09.6447750Z * [new branch] release/2.9 -> origin/release/2.9 2025-12-04T11:11:09.6447819Z * [new branch] release_notes -> origin/release_notes 2025-12-04T11:11:09.6447896Z * [new branch] remove_pyinterpreter -> origin/remove_pyinterpreter 2025-12-04T11:11:09.6448024Z * [new branch] replace-pytorch-labs-20250812-195836 -> origin/replace-pytorch-labs-20250812-195836 2025-12-04T11:11:09.6448186Z * [new branch] replace-pytorch-labs-20250812-200248 -> origin/replace-pytorch-labs-20250812-200248 2025-12-04T11:11:09.6448306Z * [new branch] replace-pytorch-labs-20250812-200324 -> origin/replace-pytorch-labs-20250812-200324 2025-12-04T11:11:09.6448426Z * [new branch] replace-pytorch-labs-20250812-204020 -> origin/replace-pytorch-labs-20250812-204020 2025-12-04T11:11:09.6448558Z * [new branch] revert-131069-gh/krzysztofjordan/1/head -> origin/revert-131069-gh/krzysztofjordan/1/head 2025-12-04T11:11:09.6448671Z * [new branch] revert-131469-gh/andrewor14/51/head -> origin/revert-131469-gh/andrewor14/51/head 2025-12-04T11:11:09.6448776Z * [new branch] revert-152361-gh/fadara01/1/head -> origin/revert-152361-gh/fadara01/1/head 2025-12-04T11:11:09.6448880Z * [new branch] revert-156870-gh/skarjala/3/head -> origin/revert-156870-gh/skarjala/3/head 2025-12-04T11:11:09.6449053Z * [new branch] revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ -> origin/revert-157914-cherry-pick-157503-by-pytorch_bot_bot_ 2025-12-04T11:11:09.6449152Z * [new branch] revert-hoo-invoke-subgraph -> origin/revert-hoo-invoke-subgraph 2025-12-04T11:11:09.6449251Z * [new branch] revert_always_build_distributed -> origin/revert_always_build_distributed 2025-12-04T11:11:09.6449322Z * [new branch] rms_norm_patch -> origin/rms_norm_patch 2025-12-04T11:11:09.6449420Z * [new branch] ruisi/fix_all_to_all_estimation -> origin/ruisi/fix_all_to_all_estimation 2025-12-04T11:11:09.6449535Z * [new branch] ruisi/fix_comm_estimation -> origin/ruisi/fix_comm_estimation 2025-12-04T11:11:09.6449644Z * [new branch] ruisi/fix_dynamic_shape_estimation -> origin/ruisi/fix_dynamic_shape_estimation 2025-12-04T11:11:09.6449770Z * [new branch] ruisi/fix_llama3_autobucketing -> origin/ruisi/fix_llama3_autobucketing 2025-12-04T11:11:09.6449875Z * [new branch] ruisi/fix_manual_bucketing_ep_pass -> 
origin/ruisi/fix_manual_bucketing_ep_pass 2025-12-04T11:11:09.6449959Z * [new branch] ruisi/manual_bucket_pass -> origin/ruisi/manual_bucket_pass 2025-12-04T11:11:09.6450107Z * [new branch] ryanguo99/cleanup-dynamo-expected-failures -> origin/ryanguo99/cleanup-dynamo-expected-failures 2025-12-04T11:11:09.6450195Z * [new branch] ryanguo99/fix-closure-var -> origin/ryanguo99/fix-closure-var 2025-12-04T11:11:09.6450275Z * [new branch] rzou/faketensor_bench -> origin/rzou/faketensor_bench 2025-12-04T11:11:09.6450339Z * [new branch] rzou/njt -> origin/rzou/njt 2025-12-04T11:11:09.6450402Z * [new branch] rzou/pca -> origin/rzou/pca 2025-12-04T11:11:09.6450472Z * [new branch] rzou/realprop -> origin/rzou/realprop 2025-12-04T11:11:09.6450536Z * [new branch] samplevllm -> origin/samplevllm 2025-12-04T11:11:09.6450704Z * [new branch] sanchitintel/weird_thing_with_test_cpu_select_algorithm -> origin/sanchitintel/weird_thing_with_test_cpu_select_algorithm 2025-12-04T11:11:09.6450799Z * [new branch] sapling-pr-archive-SS-JIA -> origin/sapling-pr-archive-SS-JIA 2025-12-04T11:11:09.6450913Z * [new branch] sapling-pr-archive-tushar00jain -> origin/sapling-pr-archive-tushar00jain 2025-12-04T11:11:09.6450974Z * [new branch] save -> origin/save 2025-12-04T11:11:09.6451037Z * [new branch] scaled_mm -> origin/scaled_mm 2025-12-04T11:11:09.6451102Z * [new branch] scan_attempt -> origin/scan_attempt 2025-12-04T11:11:09.6451166Z * [new branch] sdym/2.5.1 -> origin/sdym/2.5.1 2025-12-04T11:11:09.6451279Z * [new branch] sekyondaMeta-dynamoconfig-fix -> origin/sekyondaMeta-dynamoconfig-fix 2025-12-04T11:11:09.6451356Z * [new branch] shengf/fx-xform-perf -> origin/shengf/fx-xform-perf 2025-12-04T11:11:09.6451436Z * [new branch] shoumikhin-patch-1 -> origin/shoumikhin-patch-1 2025-12-04T11:11:09.6451513Z * [new branch] solve-accuracy-fix -> origin/solve-accuracy-fix 2025-12-04T11:11:09.6451594Z * [new branch] some_rocm_inductor_skips -> origin/some_rocm_inductor_skips 2025-12-04T11:11:09.6451676Z * [new branch] soulitzer/stash-tls-ac -> origin/soulitzer/stash-tls-ac 2025-12-04T11:11:09.6451761Z * [new branch] sparse-mm-bf16-support -> origin/sparse-mm-bf16-support 2025-12-04T11:11:09.6451834Z * [new branch] starterTaskUpdate -> origin/starterTaskUpdate 2025-12-04T11:11:09.6451895Z * [new branch] suo -> origin/suo 2025-12-04T11:11:09.6451959Z * [new branch] sve-poc -> origin/sve-poc 2025-12-04T11:11:09.6452022Z * [new branch] switch-bn -> origin/switch-bn 2025-12-04T11:11:09.6452116Z * [new branch] sy_annotation_in_autograd_hop -> origin/sy_annotation_in_autograd_hop 2025-12-04T11:11:09.6452186Z * [new branch] sy_aot_eager_record -> origin/sy_aot_eager_record 2025-12-04T11:11:09.6452256Z * [new branch] sy_custom_bucketing -> origin/sy_custom_bucketing 2025-12-04T11:11:09.6452326Z * [new branch] sy_debug_mode_test -> origin/sy_debug_mode_test 2025-12-04T11:11:09.6452412Z * [new branch] sy_deserialize -> origin/sy_deserialize 2025-12-04T11:11:09.6452479Z * [new branch] sy_dump_gm_code -> origin/sy_dump_gm_code 2025-12-04T11:11:09.6452542Z * [new branch] sy_exp -> origin/sy_exp 2025-12-04T11:11:09.6452637Z * [new branch] sy_export_annotation -> origin/sy_export_annotation 2025-12-04T11:11:09.6452706Z * [new branch] sy_invoke_subgraph -> origin/sy_invoke_subgraph 2025-12-04T11:11:09.6452777Z * [new branch] sy_kernel_bw_name -> origin/sy_kernel_bw_name 2025-12-04T11:11:09.6452843Z * [new branch] sy_multi_arch -> origin/sy_multi_arch 2025-12-04T11:11:09.6452912Z * [new branch] sy_nn_module_stack -> origin/sy_nn_module_stack 
2025-12-04T11:11:09.6452983Z * [new branch] sy_original_dtensor -> origin/sy_original_dtensor 2025-12-04T11:11:09.6453049Z * [new branch] sy_profiler_cia -> origin/sy_profiler_cia 2025-12-04T11:11:09.6453117Z * [new branch] symm_mem_sync -> origin/symm_mem_sync 2025-12-04T11:11:09.6453202Z * [new branch] sympy-bottleneck-repro -> origin/sympy-bottleneck-repro 2025-12-04T11:11:09.6453281Z * [new branch] tensordict_integration -> origin/tensordict_integration 2025-12-04T11:11:09.6453363Z * [new branch] test-move-conda-builds -> origin/test-move-conda-builds 2025-12-04T11:11:09.6453425Z * [new branch] test-old -> origin/test-old 2025-12-04T11:11:09.6453489Z * [new branch] test/bmm_heur -> origin/test/bmm_heur 2025-12-04T11:11:09.6453589Z * [new branch] tianren/customOp_autotune_fix -> origin/tianren/customOp_autotune_fix 2025-12-04T11:11:09.6453701Z * [new branch] tianren/customOp_enable_max_autotune -> origin/tianren/customOp_enable_max_autotune 2025-12-04T11:11:09.6453782Z * [new branch] tianren/customOp_fusion -> origin/tianren/customOp_fusion 2025-12-04T11:11:09.6453910Z * [new branch] tianren/customop_collectiveop_benchmark -> origin/tianren/customop_collectiveop_benchmark 2025-12-04T11:11:09.6454047Z * [new branch] tianren/customop_collectiveop_benchmark_fix -> origin/tianren/customop_collectiveop_benchmark_fix 2025-12-04T11:11:09.6454147Z * [new branch] tianren/customop_dynamic_config -> origin/tianren/customop_dynamic_config 2025-12-04T11:11:09.6454240Z * [new branch] tianren/dynamic_range_input -> origin/tianren/dynamic_range_input 2025-12-04T11:11:09.6454341Z * [new branch] tianren/dynamic_range_input_fix -> origin/tianren/dynamic_range_input_fix 2025-12-04T11:11:09.6454451Z * [new branch] tianren/dynamic_range_input_merge -> origin/tianren/dynamic_range_input_merge 2025-12-04T11:11:09.6454555Z * [new branch] tianren/flex_paged_attn_fix_temp -> origin/tianren/flex_paged_attn_fix_temp 2025-12-04T11:11:09.6454638Z * [new branch] tianren/fx_codegen_dump -> origin/tianren/fx_codegen_dump 2025-12-04T11:11:09.6454724Z * [new branch] tianren/symmetric_memory -> origin/tianren/symmetric_memory 2025-12-04T11:11:09.6454791Z * [new branch] tianren/test -> origin/tianren/test 2025-12-04T11:11:09.6454867Z * [new branch] tidy_performance_cyy -> origin/tidy_performance_cyy 2025-12-04T11:11:09.6454929Z * [new branch] tmp -> origin/tmp 2025-12-04T11:11:09.6454996Z * [new branch] torchtitan_ep -> origin/torchtitan_ep 2025-12-04T11:11:09.6455074Z * [new branch] torchtitan_integration -> origin/torchtitan_integration 2025-12-04T11:11:09.6455159Z * [new branch] trace_fsdp_torchtune_lora -> origin/trace_fsdp_torchtune_lora 2025-12-04T11:11:09.6455243Z * [new branch] traceable_fsdp_unit_tests -> origin/traceable_fsdp_unit_tests 2025-12-04T11:11:09.6455349Z * [new branch] tree_loop_vec_base -> origin/tree_loop_vec_base 2025-12-04T11:11:09.6455418Z * [new branch] triton_kernel -> origin/triton_kernel 2025-12-04T11:11:09.6455503Z * [new branch] tt_pkg_1908 -> origin/tt_pkg_1908 2025-12-04T11:11:09.6455565Z * [new branch] type_dec -> origin/type_dec 2025-12-04T11:11:09.6455665Z * [new branch] udate-sphinx-dependancies -> origin/udate-sphinx-dependancies 2025-12-04T11:11:09.6455804Z * [new branch] update-audio-commit-hash/17630256502-1803-1 -> origin/update-audio-commit-hash/17630256502-1803-1 2025-12-04T11:11:09.6455941Z * [new branch] update-audio-commit-hash/19087141161-1916-1 -> origin/update-audio-commit-hash/19087141161-1916-1 2025-12-04T11:11:09.6456076Z * [new branch] 
update-audio-commit-hash/19250643381-1929-1 -> origin/update-audio-commit-hash/19250643381-1929-1 2025-12-04T11:11:09.6456211Z * [new branch] update-audio-commit-hash/19397724337-1935-1 -> origin/update-audio-commit-hash/19397724337-1935-1 2025-12-04T11:11:09.6456346Z * [new branch] update-audio-commit-hash/19555670148-1941-1 -> origin/update-audio-commit-hash/19555670148-1941-1 2025-12-04T11:11:09.6456478Z * [new branch] update-audio-commit-hash/19750627930-1946-1 -> origin/update-audio-commit-hash/19750627930-1946-1 2025-12-04T11:11:09.6456614Z * [new branch] update-triton-commit-hash/13663274526-1487-2 -> origin/update-triton-commit-hash/13663274526-1487-2 2025-12-04T11:11:09.6456752Z * [new branch] update-vision-commit-hash/19087141161-1916-1 -> origin/update-vision-commit-hash/19087141161-1916-1 2025-12-04T11:11:09.6456889Z * [new branch] update-vision-commit-hash/19184897099-1925-1 -> origin/update-vision-commit-hash/19184897099-1925-1 2025-12-04T11:11:09.6457026Z * [new branch] update-vision-commit-hash/19250643381-1929-1 -> origin/update-vision-commit-hash/19250643381-1929-1 2025-12-04T11:11:09.6457159Z * [new branch] update-vision-commit-hash/19381328640-1934-1 -> origin/update-vision-commit-hash/19381328640-1934-1 2025-12-04T11:11:09.6457296Z * [new branch] update-vision-commit-hash/19485237164-1938-1 -> origin/update-vision-commit-hash/19485237164-1938-1 2025-12-04T11:11:09.6457431Z * [new branch] update-vllm-commit-hash/18451675449-1879-1 -> origin/update-vllm-commit-hash/18451675449-1879-1 2025-12-04T11:11:09.6457518Z * [new branch] update-vllm-dockerfile -> origin/update-vllm-dockerfile 2025-12-04T11:11:09.6457644Z * [new branch] update-xla-commit-hash/19224287370-211-1 -> origin/update-xla-commit-hash/19224287370-211-1 2025-12-04T11:11:09.6457772Z * [new branch] update-xla-commit-hash/19422028566-212-1 -> origin/update-xla-commit-hash/19422028566-212-1 2025-12-04T11:11:09.6457894Z * [new branch] update-xla-commit-hash/19626841311-213-1 -> origin/update-xla-commit-hash/19626841311-213-1 2025-12-04T11:11:09.6458022Z * [new branch] update_docs_torch_multinomial_issue#125388 -> origin/update_docs_torch_multinomial_issue#125388 2025-12-04T11:11:09.6458109Z * [new branch] update_operator_readme -> origin/update_operator_readme 2025-12-04T11:11:09.6458235Z * [new branch] update_slow_tests_1722488736 -> origin/update_slow_tests_1722488736 2025-12-04T11:11:09.6458326Z * [new branch] update_slow_tests_1722879173 -> origin/update_slow_tests_1722879173 2025-12-04T11:11:09.6458414Z * [new branch] update_slow_tests_1762155677 -> origin/update_slow_tests_1762155677 2025-12-04T11:11:09.6458501Z * [new branch] update_slow_tests_1763365283 -> origin/update_slow_tests_1763365283 2025-12-04T11:11:09.6458618Z * [new branch] update_submodule_FBGEMM -> origin/update_submodule_FBGEMM 2025-12-04T11:11:09.6458728Z * [new branch] update_submodule_kineto -> origin/update_submodule_kineto 2025-12-04T11:11:09.6458820Z * [new branch] update_submodule_tensorpipe -> origin/update_submodule_tensorpipe 2025-12-04T11:11:09.6458947Z * [new branch] upload-tests-for-autorevert -> origin/upload-tests-for-autorevert 2025-12-04T11:11:09.6459012Z * [new branch] v0.1.2 -> origin/v0.1.2 2025-12-04T11:11:09.6459074Z * [new branch] v1.0.1 -> origin/v1.0.1 2025-12-04T11:11:09.6459136Z * [new branch] v1.0.3 -> origin/v1.0.3 2025-12-04T11:11:09.6459195Z * [new branch] v1.1.0 -> origin/v1.1.0 2025-12-04T11:11:09.6459254Z * [new branch] v1.2.0 -> origin/v1.2.0 2025-12-04T11:11:09.6459312Z * [new branch] v1.3.0 -> 
origin/v1.3.0 2025-12-04T11:11:09.6459371Z * [new branch] v1.3.1 -> origin/v1.3.1 2025-12-04T11:11:09.6459437Z * [new branch] validate_fn -> origin/validate_fn 2025-12-04T11:11:09.6459509Z * [new branch] validations_2.6 -> origin/validations_2.6 2025-12-04T11:11:09.6459583Z * [new branch] validations_2.8 -> origin/validations_2.8 2025-12-04T11:11:09.6459648Z * [new branch] varlen-api -> origin/varlen-api 2025-12-04T11:11:09.6459728Z * [new branch] varlen-api-backup -> origin/varlen-api-backup 2025-12-04T11:11:09.6459809Z * [new branch] varlen_batch_invariance -> origin/varlen_batch_invariance 2025-12-04T11:11:09.6459875Z * [new branch] viable/strict -> origin/viable/strict 2025-12-04T11:11:09.6459995Z * [new branch] vishal9-team/dtensor_parallelism_toy -> origin/vishal9-team/dtensor_parallelism_toy 2025-12-04T11:11:09.6460062Z * [new branch] vllmbuildci -> origin/vllmbuildci 2025-12-04T11:11:09.6460128Z * [new branch] vllmpin -> origin/vllmpin 2025-12-04T11:11:09.6460222Z * [new branch] vscode-recommend-pyrefly -> origin/vscode-recommend-pyrefly 2025-12-04T11:11:09.6460291Z * [new branch] wdvr-patch-1 -> origin/wdvr-patch-1 2025-12-04T11:11:09.6460358Z * [new branch] wdvr/iss_145259 -> origin/wdvr/iss_145259 2025-12-04T11:11:09.6460424Z * [new branch] whc/pei -> origin/whc/pei 2025-12-04T11:11:09.6460489Z * [new branch] whc/pp_fix -> origin/whc/pp_fix 2025-12-04T11:11:09.6460552Z * [new branch] whc/sharding -> origin/whc/sharding 2025-12-04T11:11:09.6460618Z * [new branch] whc/sharding2 -> origin/whc/sharding2 2025-12-04T11:11:09.6460679Z * [new branch] whc/uneven -> origin/whc/uneven 2025-12-04T11:11:09.6460753Z * [new branch] whc/uneven-merge -> origin/whc/uneven-merge 2025-12-04T11:11:09.6460817Z * [new branch] win_warnings -> origin/win_warnings 2025-12-04T11:11:09.6460895Z * [new branch] windows_libtorch_free -> origin/windows_libtorch_free 2025-12-04T11:11:09.6460961Z * [new branch] xmfan-war -> origin/xmfan-war 2025-12-04T11:11:09.6461027Z * [new branch] xmfan/ca_0516 -> origin/xmfan/ca_0516 2025-12-04T11:11:09.6461097Z * [new branch] xmfan/ca_1051b93192 -> origin/xmfan/ca_1051b93192 2025-12-04T11:11:09.6461251Z * [new branch] xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 -> origin/xmfan/ca_1a722f62c248391fc4a542e8851a5559aa356ae8 2025-12-04T11:11:09.6461323Z * [new branch] xmfan/ca_5a2be192d1 -> origin/xmfan/ca_5a2be192d1 2025-12-04T11:11:09.6461393Z * [new branch] xmfan/ca_9d59b516e9 -> origin/xmfan/ca_9d59b516e9 2025-12-04T11:11:09.6462351Z * [new branch] xmfan/ca_apr8 -> origin/xmfan/ca_apr8 2025-12-04T11:11:09.6462422Z * [new branch] xmfan/ca_base -> origin/xmfan/ca_base 2025-12-04T11:11:09.6462511Z * [new branch] xmfan/ca_dynamic -> origin/xmfan/ca_dynamic 2025-12-04T11:11:09.6462579Z * [new branch] xmfan/ca_fix_dyn -> origin/xmfan/ca_fix_dyn 2025-12-04T11:11:09.6462654Z * [new branch] xmfan/ca_fix_lowering -> origin/xmfan/ca_fix_lowering 2025-12-04T11:11:09.6462731Z * [new branch] xmfan/ca_fix_polyfills -> origin/xmfan/ca_fix_polyfills 2025-12-04T11:11:09.6462797Z * [new branch] xmfan/ca_jan3 -> origin/xmfan/ca_jan3 2025-12-04T11:11:09.6462863Z * [new branch] xmfan/ca_jun18 -> origin/xmfan/ca_jun18 2025-12-04T11:11:09.6462929Z * [new branch] xmfan/ca_jun24 -> origin/xmfan/ca_jun24 2025-12-04T11:11:09.6462999Z * [new branch] xmfan/ca_nested -> origin/xmfan/ca_nested 2025-12-04T11:11:09.6463068Z * [new branch] xmfan/ca_overhead -> origin/xmfan/ca_overhead 2025-12-04T11:11:09.6463163Z * [new branch] xmfan/ca_overhead_0eba7e5451 -> origin/xmfan/ca_overhead_0eba7e5451 
2025-12-04T11:11:09.6463233Z * [new branch] xmfan/cacu_jun18 -> origin/xmfan/cacu_jun18 2025-12-04T11:11:09.6463301Z * [new branch] xmfan/cacu_jun19 -> origin/xmfan/cacu_jun19 2025-12-04T11:11:09.6463371Z * [new branch] xmfan/cacu_jun4 -> origin/xmfan/cacu_jun4 2025-12-04T11:11:09.6463454Z * [new branch] xmfan/disable_duck_shape -> origin/xmfan/disable_duck_shape 2025-12-04T11:11:09.6463554Z * [new branch] xmfan/fca_cpp_node_passthrough -> origin/xmfan/fca_cpp_node_passthrough 2025-12-04T11:11:09.6463712Z * [new branch] xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/post_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T11:11:09.6463863Z * [new branch] xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 -> origin/xmfan/pre_3945954741e2d37023c5d6954f9483008e0892f9 2025-12-04T11:11:09.6463936Z * [new branch] xmfan/single_step -> origin/xmfan/single_step 2025-12-04T11:11:09.6464006Z * [new branch] xmfan/sth_0829 -> origin/xmfan/sth_0829 2025-12-04T11:11:09.6464068Z * [new branch] xmfan/test -> origin/xmfan/test 2025-12-04T11:11:09.6464157Z * [new branch] yguo/debug-0226-constexpr -> origin/yguo/debug-0226-constexpr 2025-12-04T11:11:09.6464238Z * [new branch] yguo/new_latest_changes -> origin/yguo/new_latest_changes 2025-12-04T11:11:09.6464334Z * [new branch] yguo/patch_constexpr_changes -> origin/yguo/patch_constexpr_changes 2025-12-04T11:11:09.6464404Z * [new branch] yiming/bootcamp -> origin/yiming/bootcamp 2025-12-04T11:11:09.6464508Z * [new branch] yiming/run_with_start_end_rng_hop -> origin/yiming/run_with_start_end_rng_hop 2025-12-04T11:11:09.6464576Z * [new branch] yolo-llama3 -> origin/yolo-llama3 2025-12-04T11:11:09.6464650Z * [new branch] zainr/canary-test -> origin/zainr/canary-test 2025-12-04T11:11:09.6464743Z * [new branch] zainr/cleanup-gh-runners -> origin/zainr/cleanup-gh-runners 2025-12-04T11:11:09.6464823Z * [new branch] zainr/pull-migration-c -> origin/zainr/pull-migration-c 2025-12-04T11:11:09.6464889Z * [new branch] zainr/test2 -> origin/zainr/test2 2025-12-04T11:11:09.6464963Z * [new branch] zasdfgbnm-patch-3 -> origin/zasdfgbnm-patch-3 2025-12-04T11:11:09.6465023Z * [new branch] zb2p -> origin/zb2p 2025-12-04T11:11:09.6465132Z * [new branch] zeros-and-scatter-part2 -> origin/zeros-and-scatter-part2 2025-12-04T11:11:09.6465222Z * [new branch] zhxchen17/ci/vllm_lora_oom -> origin/zhxchen17/ci/vllm_lora_oom 2025-12-04T11:11:09.6465327Z * [new branch] zhxchen17/ci/vllm_multimodal_oom -> origin/zhxchen17/ci/vllm_multimodal_oom 2025-12-04T11:11:09.6465431Z * [new branch] zhxchen17/ci/vllm_pin -> origin/zhxchen17/ci/vllm_pin 2025-12-04T11:11:09.6465556Z * [new branch] zhxchen17/dynamo/unsafe_drop_all_guards -> origin/zhxchen17/dynamo/unsafe_drop_all_guards 2025-12-04T11:11:09.6465656Z * [new branch] zhxchen17/export/call_override -> origin/zhxchen17/export/call_override 2025-12-04T11:11:09.6465747Z * [new branch] zhxchen17/export/codemod1 -> origin/zhxchen17/export/codemod1 2025-12-04T11:11:09.6465838Z * [new branch] zhxchen17/export/ctx_return -> origin/zhxchen17/export/ctx_return 2025-12-04T11:11:09.6465967Z * [new branch] zhxchen17/export/disable_side_effect_warn -> origin/zhxchen17/export/disable_side_effect_warn 2025-12-04T11:11:09.6466068Z * [new branch] zhxchen17/export/pytree_check -> origin/zhxchen17/export/pytree_check 2025-12-04T11:11:09.6466158Z * [new branch] zhxchen17/precompile/aoti -> origin/zhxchen17/precompile/aoti 2025-12-04T11:11:09.6466255Z * [new branch] zhxchen17/precompile/globals -> origin/zhxchen17/precompile/globals 
2025-12-04T11:11:09.6466374Z * [new branch] zhxchen17/precompile/inductor_guards -> origin/zhxchen17/precompile/inductor_guards 2025-12-04T11:11:09.6466449Z * [new branch] zhxchen17/scratch/0 -> origin/zhxchen17/scratch/0 2025-12-04T11:11:09.6466556Z * [new branch] zhxchen17/torch_export_api_update -> origin/zhxchen17/torch_export_api_update 2025-12-04T11:11:09.6466633Z * [new branch] zhxhcen17/moodycamel -> origin/zhxhcen17/moodycamel 2025-12-04T11:11:09.6466710Z * [new branch] zxiiro/build-times -> origin/zxiiro/build-times 2025-12-04T11:11:09.6466787Z * [new branch] zxiiro/c7i.2xlarge -> origin/zxiiro/c7i.2xlarge 2025-12-04T11:11:09.6466867Z * [new branch] zxiiro/c7i.2xlarge.h100 -> origin/zxiiro/c7i.2xlarge.h100 2025-12-04T11:11:09.6466932Z * [new branch] zxiiro/main -> origin/zxiiro/main 2025-12-04T11:11:09.6466998Z * [new branch] zxiiro/risc64 -> origin/zxiiro/risc64 2025-12-04T11:11:09.6467091Z * [new branch] zxiiro/test-multicloud-arc -> origin/zxiiro/test-multicloud-arc 2025-12-04T11:11:09.8546655Z [command]/usr/bin/git rev-parse --verify --quiet ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32^{object} 2025-12-04T11:11:09.8714681Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:09.8719554Z ##[endgroup] 2025-12-04T11:11:09.8720037Z ##[group]Determining the checkout info 2025-12-04T11:11:09.8720413Z ##[endgroup] 2025-12-04T11:11:09.8724549Z [command]/usr/bin/git sparse-checkout disable 2025-12-04T11:11:09.8817988Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-12-04T11:11:09.8833725Z ##[group]Checking out the ref 2025-12-04T11:11:09.8835213Z [command]/usr/bin/git checkout --progress --force ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:09.9954002Z Previous HEAD position was c0cb6e784044 [DTensor] ExplicitRedistributionContext warning mode (#169452) 2025-12-04T11:11:09.9959939Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T11:11:10.0074094Z ##[endgroup] 2025-12-04T11:11:10.0074536Z ##[group]Setting up auth for fetching submodules 2025-12-04T11:11:10.0080714Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T11:11:10.0128004Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-12-04T11:11:10.0153970Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-12-04T11:11:10.0178265Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-12-04T11:11:10.0197040Z ##[endgroup] 2025-12-04T11:11:10.0197369Z ##[group]Fetching submodules 2025-12-04T11:11:10.0201065Z [command]/usr/bin/git submodule sync --recursive 2025-12-04T11:11:10.0366571Z Synchronizing submodule url for 'android/libs/fbjni' 2025-12-04T11:11:10.0379860Z Synchronizing submodule url for 'third_party/FP16' 2025-12-04T11:11:10.0390751Z Synchronizing submodule url for 'third_party/FXdiv' 2025-12-04T11:11:10.0401268Z Synchronizing submodule url for 'third_party/NNPACK' 2025-12-04T11:11:10.0412051Z Synchronizing submodule url for 'third_party/NVTX' 2025-12-04T11:11:10.0422980Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:10.0433528Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-12-04T11:11:10.0451966Z Synchronizing submodule url for 'third_party/aiter' 2025-12-04T11:11:10.0463721Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:10.0481174Z 
Synchronizing submodule url for 'third_party/benchmark' 2025-12-04T11:11:10.0492397Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-12-04T11:11:10.0509976Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-12-04T11:11:10.0522455Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-12-04T11:11:10.0533456Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-12-04T11:11:10.0543958Z Synchronizing submodule url for 'third_party/cutlass' 2025-12-04T11:11:10.0560130Z Synchronizing submodule url for 'third_party/fbgemm' 2025-12-04T11:11:10.0572578Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:10.0583534Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:10.0601903Z Synchronizing submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:10.0612880Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:10.0629828Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:10.0642558Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:10.0653531Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-12-04T11:11:10.0666725Z Synchronizing submodule url for 'third_party/flash-attention' 2025-12-04T11:11:10.0678272Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:10.0693814Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:10.0710131Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-12-04T11:11:10.0722923Z Synchronizing submodule url for 'third_party/fmt' 2025-12-04T11:11:10.0734681Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:10.0746766Z Synchronizing submodule url for 'third_party/gloo' 2025-12-04T11:11:10.0767648Z Synchronizing submodule url for 'third_party/googletest' 2025-12-04T11:11:10.0782843Z Synchronizing submodule url for 'third_party/ideep' 2025-12-04T11:11:10.0794590Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:10.0813301Z Synchronizing submodule url for 'third_party/ittapi' 2025-12-04T11:11:10.0826451Z Synchronizing submodule url for 'third_party/kineto' 2025-12-04T11:11:10.0838865Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:10.0850682Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:10.0863190Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:10.0874446Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:10.0885364Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:10.0898459Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:10.0910663Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:10.0921964Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:10.0932764Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:10.0944044Z Synchronizing 
submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:10.0954524Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:10.0975671Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:10.0990660Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:10.1008426Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:10.1019912Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:10.1032836Z Synchronizing submodule url for 'third_party/kleidiai' 2025-12-04T11:11:10.1045502Z Synchronizing submodule url for 'third_party/mimalloc' 2025-12-04T11:11:10.1056181Z Synchronizing submodule url for 'third_party/nlohmann' 2025-12-04T11:11:10.1067503Z Synchronizing submodule url for 'third_party/onnx' 2025-12-04T11:11:10.1086896Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:10.1100641Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-12-04T11:11:10.1111721Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:10.1123274Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:10.1135918Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:10.1148464Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:10.1172509Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:10.1185882Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:10.1195828Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:10.1206019Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:10.1218286Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:10.1231368Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:10.1252563Z Synchronizing submodule url for 'third_party/pocketfft' 2025-12-04T11:11:10.1264315Z Synchronizing submodule url for 'third_party/protobuf' 2025-12-04T11:11:10.1276429Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:10.1287355Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:10.1302165Z Synchronizing submodule url for 'third_party/psimd' 2025-12-04T11:11:10.1313317Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-12-04T11:11:10.1323872Z Synchronizing submodule url for 'third_party/pybind11' 2025-12-04T11:11:10.1334799Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-12-04T11:11:10.1345971Z Synchronizing submodule url for 'third_party/sleef' 2025-12-04T11:11:10.1356530Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-12-04T11:11:10.1366666Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:10.1379533Z Synchronizing submodule 
url for 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:10.1390433Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:10.1403032Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:10.1413193Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:10.1435727Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-12-04T11:11:10.1672011Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-12-04T11:11:10.1752004Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-12-04T11:11:10.1801493Z Submodule path 'third_party/FXdiv': checked out 'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-12-04T11:11:10.1917552Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-12-04T11:11:10.1984050Z Submodule path 'third_party/NVTX': checked out '3ebbc93ded7285963bff932c678fa367eb393ba6' 2025-12-04T11:11:10.2046363Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-12-04T11:11:10.7272902Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-12-04T11:11:10.7422333Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-12-04T11:11:10.7629576Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-12-04T11:11:10.7752030Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-12-04T11:11:10.7923666Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T11:11:10.8010375Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-12-04T11:11:10.8699755Z Submodule path 'third_party/cpuinfo': checked out 'f858c30bcb16f8effd5ff46996f0514539e17abc' 2025-12-04T11:11:10.8785452Z Submodule path 'third_party/cudnn_frontend': checked out '0b1577c8c83401237d601d0d0db5210506705396' 2025-12-04T11:11:10.8911723Z Submodule path 'third_party/cutlass': checked out 'f88806b1e31dfa579842638740216dd41fc6c588' 2025-12-04T11:11:10.9848689Z Submodule path 'third_party/fbgemm': checked out 'c0b988d39a9e47c794d699f29930ed4d7c7e13a4' 2025-12-04T11:11:11.0251221Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-12-04T11:11:11.2113170Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T11:11:11.2793749Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-12-04T11:11:11.4030803Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '98125ce499b0fdf7ffbe0e3052f5b8709f4840f8' 2025-12-04T11:11:11.4229403Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:11.4297369Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-12-04T11:11:11.4834449Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-12-04T11:11:11.4932498Z Submodule path 'third_party/flash-attention': 
checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-12-04T11:11:11.5110640Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-12-04T11:11:11.5216773Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-12-04T11:11:11.5307079Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-12-04T11:11:11.5454668Z Submodule path 'third_party/fmt': checked out '407c905e45ad75fc29bf0f9bb7c5c2fd3475976f' 2025-12-04T11:11:11.5655147Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-12-04T11:11:11.5768981Z Submodule path 'third_party/gloo': checked out '54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-12-04T11:11:11.5958114Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:11.6050379Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-12-04T11:11:11.7063402Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-12-04T11:11:11.7143105Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-12-04T11:11:11.7215409Z Submodule path 'third_party/kineto': checked out '31f85df8fbd89c188f14ef10f1ec65379786b943' 2025-12-04T11:11:11.7295286Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-12-04T11:11:11.7391520Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-12-04T11:11:11.7446703Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-12-04T11:11:11.7511638Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-12-04T11:11:11.7580090Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-12-04T11:11:11.7637152Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-12-04T11:11:11.7698874Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-12-04T11:11:11.7762082Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:11.7846601Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-12-04T11:11:11.7899036Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-12-04T11:11:11.7954564Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-12-04T11:11:11.8029001Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-12-04T11:11:11.8083876Z Submodule path 
'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T11:11:11.8154579Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-12-04T11:11:11.8209117Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:11.8287483Z Submodule path 'third_party/kleidiai': checked out 'd7770c89632329a9914ef1a90289917597639cbe' 2025-12-04T11:11:11.8366208Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-12-04T11:11:11.8471254Z Submodule path 'third_party/nlohmann': checked out '55f93686c01528224f448c19128836e7df245f72' 2025-12-04T11:11:12.0254838Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-12-04T11:11:12.0449323Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-12-04T11:11:12.0562136Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-12-04T11:11:12.0625523Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-12-04T11:11:12.0698477Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-12-04T11:11:12.0755630Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-12-04T11:11:12.0846109Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-12-04T11:11:12.0923044Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-12-04T11:11:12.1017791Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-12-04T11:11:12.1174865Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-12-04T11:11:12.1259565Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-12-04T11:11:12.1314352Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T11:11:12.1482493Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-12-04T11:11:12.1550859Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-12-04T11:11:12.3211915Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-12-04T11:11:12.3308903Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-12-04T11:11:12.3524868Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-12-04T11:11:12.3583810Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-12-04T11:11:12.3662798Z Submodule path 'third_party/pthreadpool': checked 
out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-12-04T11:11:12.3918327Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-12-04T11:11:12.4160282Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-12-04T11:11:12.4571601Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-12-04T11:11:12.4678503Z Submodule path 'third_party/tensorpipe': checked out '2b4cd91092d335a697416b2a3cb398283246849d' 2025-12-04T11:11:12.4876054Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-12-04T11:11:12.4957474Z Submodule path 'third_party/tensorpipe/third_party/libnop': checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-12-04T11:11:12.5239889Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-12-04T11:11:12.5358981Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-12-04T11:11:12.5416104Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-12-04T11:11:12.5439818Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-12-04T11:11:12.5627125Z Entering 'android/libs/fbjni' 2025-12-04T11:11:12.5652011Z Entering 'third_party/FP16' 2025-12-04T11:11:12.5675592Z Entering 'third_party/FXdiv' 2025-12-04T11:11:12.5700219Z Entering 'third_party/NNPACK' 2025-12-04T11:11:12.5724551Z Entering 'third_party/NVTX' 2025-12-04T11:11:12.5750185Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:12.5772786Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:12.5801671Z Entering 'third_party/aiter' 2025-12-04T11:11:12.5825317Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:12.5855548Z Entering 'third_party/benchmark' 2025-12-04T11:11:12.5879542Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:12.5906620Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:12.5929222Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:12.5954981Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:12.5977595Z Entering 'third_party/cutlass' 2025-12-04T11:11:12.6004582Z Entering 'third_party/fbgemm' 2025-12-04T11:11:12.6029596Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:12.6048238Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:12.6073818Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:12.6094452Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:12.6119792Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:12.6141785Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:12.6164282Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:12.6187049Z Entering 'third_party/flash-attention' 2025-12-04T11:11:12.6209591Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:12.6231902Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:12.6257454Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:12.6280774Z Entering 'third_party/fmt' 2025-12-04T11:11:12.6302446Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:12.6324943Z Entering 'third_party/gloo' 2025-12-04T11:11:12.6347050Z Entering 'third_party/googletest' 2025-12-04T11:11:12.6368434Z Entering 
'third_party/ideep' 2025-12-04T11:11:12.6390561Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:12.6414712Z Entering 'third_party/ittapi' 2025-12-04T11:11:12.6437094Z Entering 'third_party/kineto' 2025-12-04T11:11:12.6458848Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:12.6479665Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:12.6501422Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:12.6523391Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:12.6544264Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:12.6572068Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:12.6595738Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:12.6616127Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:12.6637582Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:12.6658459Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:12.6680028Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:12.6701506Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:12.6722550Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:12.6747344Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:12.6767993Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:12.6790218Z Entering 'third_party/kleidiai' 2025-12-04T11:11:12.6812931Z Entering 'third_party/mimalloc' 2025-12-04T11:11:12.6834658Z Entering 'third_party/nlohmann' 2025-12-04T11:11:12.6857280Z Entering 'third_party/onnx' 2025-12-04T11:11:12.6887117Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:12.6915538Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:12.6939644Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:12.6960389Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:12.6981779Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:12.7003123Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:12.7025284Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:12.7054476Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:12.7081008Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:12.7101309Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:12.7125936Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:12.7155452Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:12.7183200Z Entering 'third_party/pocketfft' 2025-12-04T11:11:12.7206406Z Entering 'third_party/protobuf' 2025-12-04T11:11:12.7233061Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:12.7255194Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:12.7278918Z Entering 'third_party/psimd' 
2025-12-04T11:11:12.7301499Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:12.7327290Z Entering 'third_party/pybind11' 2025-12-04T11:11:12.7348917Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:12.7372917Z Entering 'third_party/sleef' 2025-12-04T11:11:12.7396252Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:12.7418412Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:12.7439184Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:12.7460214Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:12.7480746Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:12.7501945Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:12.7535691Z ##[endgroup] 2025-12-04T11:11:12.7535890Z ##[group]Persisting credentials for submodules 2025-12-04T11:11:12.7541046Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-12-04T11:11:12.7700192Z Entering 'android/libs/fbjni' 2025-12-04T11:11:12.7725264Z Entering 'third_party/FP16' 2025-12-04T11:11:12.7753733Z Entering 'third_party/FXdiv' 2025-12-04T11:11:12.7778265Z Entering 'third_party/NNPACK' 2025-12-04T11:11:12.7804288Z Entering 'third_party/NVTX' 2025-12-04T11:11:12.7829500Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:12.7853137Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:12.7883166Z Entering 'third_party/aiter' 2025-12-04T11:11:12.7909455Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:12.7938121Z Entering 'third_party/benchmark' 2025-12-04T11:11:12.7970759Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:12.7999975Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:12.8024369Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:12.8049193Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:12.8074337Z Entering 'third_party/cutlass' 2025-12-04T11:11:12.8102799Z Entering 'third_party/fbgemm' 2025-12-04T11:11:12.8129327Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:12.8151565Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:12.8177814Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:12.8200870Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:12.8229391Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:12.8253422Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:12.8276907Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:12.8306047Z Entering 'third_party/flash-attention' 2025-12-04T11:11:12.8331759Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:12.8361189Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:12.8391338Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:12.8416655Z Entering 'third_party/fmt' 2025-12-04T11:11:12.8441921Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:12.8466455Z Entering 'third_party/gloo' 2025-12-04T11:11:12.8491562Z Entering 'third_party/googletest' 2025-12-04T11:11:12.8517962Z Entering 'third_party/ideep' 2025-12-04T11:11:12.8543705Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:12.8571331Z Entering 'third_party/ittapi' 2025-12-04T11:11:12.8600427Z Entering 'third_party/kineto' 2025-12-04T11:11:12.8624099Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 
2025-12-04T11:11:12.8648625Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:12.8672604Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:12.8696615Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:12.8720922Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:12.8744619Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:12.8770393Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:12.8793492Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:12.8816155Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:12.8841162Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:12.8864320Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:12.8888139Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:12.8915323Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:12.8943812Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:12.8967538Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:12.8993428Z Entering 'third_party/kleidiai' 2025-12-04T11:11:12.9017277Z Entering 'third_party/mimalloc' 2025-12-04T11:11:12.9043379Z Entering 'third_party/nlohmann' 2025-12-04T11:11:12.9067875Z Entering 'third_party/onnx' 2025-12-04T11:11:12.9098289Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:12.9129695Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:12.9153821Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:12.9181634Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:12.9209617Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:12.9255991Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:12.9291033Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:12.9314896Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:12.9343434Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:12.9367252Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:12.9400550Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:12.9426764Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:12.9458740Z Entering 'third_party/pocketfft' 2025-12-04T11:11:12.9483307Z Entering 'third_party/protobuf' 2025-12-04T11:11:12.9513354Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:12.9537444Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:12.9565271Z Entering 'third_party/psimd' 2025-12-04T11:11:12.9590472Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:12.9616556Z Entering 'third_party/pybind11' 2025-12-04T11:11:12.9641355Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:12.9665953Z Entering 'third_party/sleef' 2025-12-04T11:11:12.9690790Z Entering 
'third_party/tensorpipe' 2025-12-04T11:11:12.9718590Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:12.9740731Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:12.9764433Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:12.9792495Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:12.9816037Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:12.9853544Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-12-04T11:11:13.0039428Z Entering 'android/libs/fbjni' 2025-12-04T11:11:13.0062575Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T11:11:13.0073513Z Entering 'third_party/FP16' 2025-12-04T11:11:13.0099948Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T11:11:13.0110972Z Entering 'third_party/FXdiv' 2025-12-04T11:11:13.0132257Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T11:11:13.0142578Z Entering 'third_party/NNPACK' 2025-12-04T11:11:13.0166292Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T11:11:13.0176876Z Entering 'third_party/NVTX' 2025-12-04T11:11:13.0197372Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T11:11:13.0207955Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:13.0230923Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T11:11:13.0241474Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:13.0262802Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T11:11:13.0277546Z Entering 'third_party/aiter' 2025-12-04T11:11:13.0302593Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T11:11:13.0314898Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:13.0336364Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T11:11:13.0354199Z Entering 'third_party/benchmark' 2025-12-04T11:11:13.0374823Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:13.0385482Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:13.0407757Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T11:11:13.0420514Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:13.0440968Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T11:11:13.0451435Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:13.0474225Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T11:11:13.0484939Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:13.0506273Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T11:11:13.0518718Z Entering 'third_party/cutlass' 2025-12-04T11:11:13.0570377Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T11:11:13.0635841Z Entering 'third_party/fbgemm' 2025-12-04T11:11:13.0669571Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T11:11:13.0704285Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:13.0725442Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T11:11:13.0732281Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:13.0758677Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T11:11:13.0775893Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:13.0801016Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T11:11:13.0808898Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:13.0824927Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T11:11:13.0835633Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:13.0853996Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T11:11:13.0862151Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:13.0879974Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T11:11:13.0889004Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:13.0906447Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T11:11:13.0920075Z Entering 'third_party/flash-attention' 2025-12-04T11:11:13.0941435Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T11:11:13.0953071Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:13.0973189Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T11:11:13.0984645Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:13.1003546Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T11:11:13.1020027Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:13.1041354Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T11:11:13.1054774Z Entering 'third_party/fmt' 2025-12-04T11:11:13.1074764Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:13.1086033Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:13.1106598Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T11:11:13.1117150Z Entering 'third_party/gloo' 2025-12-04T11:11:13.1137787Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T11:11:13.1149606Z Entering 'third_party/googletest' 2025-12-04T11:11:13.1169038Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 
2025-12-04T11:11:13.1180390Z Entering 'third_party/ideep' 2025-12-04T11:11:13.1199446Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T11:11:13.1209547Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:13.1227776Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T11:11:13.1240955Z Entering 'third_party/ittapi' 2025-12-04T11:11:13.1260798Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T11:11:13.1272037Z Entering 'third_party/kineto' 2025-12-04T11:11:13.1291628Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T11:11:13.1302689Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:13.1320839Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T11:11:13.1329209Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:13.1350304Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T11:11:13.1359688Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:13.1378577Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T11:11:13.1386958Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:13.1406092Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:13.1414758Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:13.1434251Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T11:11:13.1442189Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:13.1461739Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T11:11:13.1471132Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:13.1490043Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T11:11:13.1499021Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:13.1517405Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:13.1527939Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:13.1547579Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T11:11:13.1556946Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:13.1575678Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T11:11:13.1584464Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:13.1602742Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:13.1611252Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.1630596Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:13.1640599Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.1659718Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:13.1669993Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:13.1689168Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T11:11:13.1697441Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:13.1716716Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T11:11:13.1729724Z Entering 'third_party/kleidiai' 2025-12-04T11:11:13.1755346Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T11:11:13.1770993Z Entering 'third_party/mimalloc' 2025-12-04T11:11:13.1795150Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T11:11:13.1806894Z Entering 'third_party/nlohmann' 2025-12-04T11:11:13.1827686Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T11:11:13.1840070Z Entering 'third_party/onnx' 2025-12-04T11:11:13.1860402Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T11:11:13.1879469Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:13.1899490Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:13.1910423Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:13.1931521Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T11:11:13.1942802Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:13.1961683Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:13.1970257Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:13.1988328Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:13.1996429Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:13.2014817Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T11:11:13.2023624Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:13.2041577Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T11:11:13.2050843Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:13.2068547Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T11:11:13.2076760Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:13.2094999Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T11:11:13.2103311Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:13.2121254Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:13.2129622Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.2147859Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:13.2157243Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.2175412Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:13.2185025Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:13.2203138Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T11:11:13.2221573Z Entering 'third_party/pocketfft' 2025-12-04T11:11:13.2241056Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T11:11:13.2251338Z Entering 'third_party/protobuf' 2025-12-04T11:11:13.2269678Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T11:11:13.2280731Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:13.2298677Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:13.2307056Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:13.2325309Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:13.2335522Z Entering 'third_party/psimd' 2025-12-04T11:11:13.2355681Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T11:11:13.2366195Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:13.2385269Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T11:11:13.2395259Z Entering 'third_party/pybind11' 2025-12-04T11:11:13.2413860Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 
2025-12-04T11:11:13.2424576Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:13.2443411Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T11:11:13.2453793Z Entering 'third_party/sleef' 2025-12-04T11:11:13.2472549Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T11:11:13.2483437Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:13.2502554Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T11:11:13.2512850Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:13.2531895Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:13.2540632Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:13.2558434Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T11:11:13.2566903Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:13.2587530Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T11:11:13.2596420Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:13.2615040Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:13.2622875Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:13.2642238Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T11:11:13.2840763Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-12-04T11:11:13.3026272Z Entering 'android/libs/fbjni' 2025-12-04T11:11:13.3047339Z Entering 'third_party/FP16' 2025-12-04T11:11:13.3070341Z Entering 'third_party/FXdiv' 2025-12-04T11:11:13.3092509Z Entering 'third_party/NNPACK' 2025-12-04T11:11:13.3114198Z Entering 'third_party/NVTX' 2025-12-04T11:11:13.3136707Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:13.3157204Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:13.3184067Z Entering 'third_party/aiter' 2025-12-04T11:11:13.3206096Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:13.3229827Z Entering 'third_party/benchmark' 2025-12-04T11:11:13.3251040Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:13.3281361Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:13.3302406Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:13.3324533Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:13.3346978Z Entering 'third_party/cutlass' 2025-12-04T11:11:13.3383016Z Entering 'third_party/fbgemm' 2025-12-04T11:11:13.3411734Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:13.3430422Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:13.3451760Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:13.3470275Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:13.3488044Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:13.3503061Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:13.3533005Z Entering 'third_party/fbgemm/external/json' 
2025-12-04T11:11:13.3566115Z Entering 'third_party/flash-attention' 2025-12-04T11:11:13.3597855Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:13.3625646Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:13.3657424Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:13.3683735Z Entering 'third_party/fmt' 2025-12-04T11:11:13.3707717Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:13.3742577Z Entering 'third_party/gloo' 2025-12-04T11:11:13.3766634Z Entering 'third_party/googletest' 2025-12-04T11:11:13.3789049Z Entering 'third_party/ideep' 2025-12-04T11:11:13.3822079Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:13.3852838Z Entering 'third_party/ittapi' 2025-12-04T11:11:13.3879326Z Entering 'third_party/kineto' 2025-12-04T11:11:13.3907353Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:13.3948052Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:13.3970736Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:13.3986052Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:13.4004528Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:13.4023663Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:13.4053182Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:13.4069971Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:13.4086747Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:13.4106252Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:13.4126186Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:13.4154481Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.4172426Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.4203336Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:13.4241729Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:13.4274681Z Entering 'third_party/kleidiai' 2025-12-04T11:11:13.4302987Z Entering 'third_party/mimalloc' 2025-12-04T11:11:13.4331232Z Entering 'third_party/nlohmann' 2025-12-04T11:11:13.4359423Z Entering 'third_party/onnx' 2025-12-04T11:11:13.4389963Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:13.4423514Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:13.4447479Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:13.4469436Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:13.4492169Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:13.4516450Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:13.4538228Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:13.4564685Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:13.4585820Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:13.4609468Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.4631724Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.4660036Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:13.4695232Z Entering 'third_party/pocketfft' 2025-12-04T11:11:13.4718666Z Entering 'third_party/protobuf' 2025-12-04T11:11:13.4751225Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:13.4773196Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:13.4803812Z Entering 'third_party/psimd' 2025-12-04T11:11:13.4826404Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:13.4848085Z Entering 'third_party/pybind11' 2025-12-04T11:11:13.4873111Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:13.4898930Z Entering 'third_party/sleef' 2025-12-04T11:11:13.4924464Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:13.4947355Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:13.4967315Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:13.4993145Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:13.5014585Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:13.5035761Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:13.5073841Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-12-04T11:11:13.5242483Z Entering 'android/libs/fbjni' 2025-12-04T11:11:13.5267698Z Entering 'third_party/FP16' 2025-12-04T11:11:13.5288672Z Entering 'third_party/FXdiv' 2025-12-04T11:11:13.5311431Z Entering 'third_party/NNPACK' 2025-12-04T11:11:13.5334335Z Entering 'third_party/NVTX' 2025-12-04T11:11:13.5355649Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:13.5375927Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:13.5403544Z Entering 'third_party/aiter' 2025-12-04T11:11:13.5426584Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:13.5451439Z Entering 'third_party/benchmark' 2025-12-04T11:11:13.5475041Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:13.5499902Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:13.5521621Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:13.5543441Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:13.5570253Z Entering 'third_party/cutlass' 2025-12-04T11:11:13.5598460Z Entering 'third_party/fbgemm' 2025-12-04T11:11:13.5619724Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:13.5640014Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:13.5668453Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:13.5688649Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:13.5713867Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:13.5733603Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:13.5753117Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:13.5775091Z Entering 'third_party/flash-attention' 2025-12-04T11:11:13.5798935Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:13.5821904Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:13.5847531Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:13.5871548Z Entering 'third_party/fmt' 2025-12-04T11:11:13.5893585Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:13.5916212Z 
Entering 'third_party/gloo' 2025-12-04T11:11:13.5939374Z Entering 'third_party/googletest' 2025-12-04T11:11:13.5964593Z Entering 'third_party/ideep' 2025-12-04T11:11:13.5985466Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:13.6012839Z Entering 'third_party/ittapi' 2025-12-04T11:11:13.6035607Z Entering 'third_party/kineto' 2025-12-04T11:11:13.6056390Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:13.6078241Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:13.6100185Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:13.6126329Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:13.6148387Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:13.6170422Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:13.6193800Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:13.6215581Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:13.6239405Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:13.6261909Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:13.6282934Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:13.6304064Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.6327169Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.6352647Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:13.6374880Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:13.6397868Z Entering 'third_party/kleidiai' 2025-12-04T11:11:13.6422816Z Entering 'third_party/mimalloc' 2025-12-04T11:11:13.6443098Z Entering 'third_party/nlohmann' 2025-12-04T11:11:13.6466501Z Entering 'third_party/onnx' 2025-12-04T11:11:13.6494219Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:13.6519937Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:13.6547287Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:13.6568066Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:13.6590650Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:13.6616735Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:13.6640977Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:13.6666428Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:13.6688896Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:13.6713113Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:13.6740539Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:13.6768947Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:13.6799872Z Entering 'third_party/pocketfft' 2025-12-04T11:11:13.6821502Z Entering 'third_party/protobuf' 2025-12-04T11:11:13.6843484Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:13.6868039Z 
Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:13.6891566Z Entering 'third_party/psimd' 2025-12-04T11:11:13.6918597Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:13.6941426Z Entering 'third_party/pybind11' 2025-12-04T11:11:13.6965340Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:13.6986477Z Entering 'third_party/sleef' 2025-12-04T11:11:13.7006881Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:13.7029145Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:13.7049702Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:13.7069612Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:13.7096827Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:13.7116360Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:13.7152573Z ##[endgroup] 2025-12-04T11:11:13.7319994Z [command]/usr/bin/git log -1 --format=%H 2025-12-04T11:11:13.7412437Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:13.7535987Z ##[group]Run actions/checkout@v4 2025-12-04T11:11:13.7536125Z with: 2025-12-04T11:11:13.7536233Z ref: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:13.7536367Z fetch-depth: 0 2025-12-04T11:11:13.7536468Z submodules: recursive 2025-12-04T11:11:13.7536569Z show-progress: false 2025-12-04T11:11:13.7536689Z repository: pytorch/pytorch 2025-12-04T11:11:13.7536845Z token: *** 2025-12-04T11:11:13.7536936Z ssh-strict: true 2025-12-04T11:11:13.7537035Z ssh-user: git 2025-12-04T11:11:13.7537134Z persist-credentials: true 2025-12-04T11:11:13.7537244Z clean: true 2025-12-04T11:11:13.7537347Z sparse-checkout-cone-mode: true 2025-12-04T11:11:13.7537476Z fetch-tags: false 2025-12-04T11:11:13.7537577Z lfs: false 2025-12-04T11:11:13.7537670Z set-safe-directory: true 2025-12-04T11:11:13.7537773Z env: 2025-12-04T11:11:13.7537868Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:13.7537985Z ##[endgroup] 2025-12-04T11:11:13.8004499Z Syncing repository: pytorch/pytorch 2025-12-04T11:11:13.8004819Z ##[group]Getting Git version info 2025-12-04T11:11:13.8005025Z Working directory is '/home/runner/_work/pytorch/pytorch' 2025-12-04T11:11:13.8019041Z [command]/usr/bin/git version 2025-12-04T11:11:13.8047492Z git version 2.52.0 2025-12-04T11:11:13.8063612Z ##[endgroup] 2025-12-04T11:11:13.8070036Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/3665f695-7bbc-4c6c-b9ba-d298500bfe04/.gitconfig' 2025-12-04T11:11:13.8076131Z Temporarily overriding HOME='/home/runner/_work/_temp/3665f695-7bbc-4c6c-b9ba-d298500bfe04' before making global git config changes 2025-12-04T11:11:13.8076669Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T11:11:13.8079556Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T11:11:13.8104292Z [command]/usr/bin/git config --local --get remote.origin.url 2025-12-04T11:11:13.8121026Z https://github.com/pytorch/pytorch 2025-12-04T11:11:13.8135932Z ##[group]Removing previously created refs, to avoid conflicts 2025-12-04T11:11:13.8139452Z [command]/usr/bin/git rev-parse --symbolic-full-name --verify --quiet HEAD 2025-12-04T11:11:13.8154075Z HEAD 2025-12-04T11:11:13.8177910Z ##[endgroup] 2025-12-04T11:11:13.8179535Z [command]/usr/bin/git submodule status 2025-12-04T11:11:13.8386517Z 7e1e1fe3858c63c251c637ae41a20de425dde96f android/libs/fbjni (v0.1.0-12-g7e1e1fe) 2025-12-04T11:11:13.8434322Z 4dfe081cf6bcd15db339cf2680b9281b8451eeb3 
third_party/FP16 (4dfe081) 2025-12-04T11:11:13.8486678Z b408327ac2a15ec3e43352421954f5b1967701d1 third_party/FXdiv (b408327) 2025-12-04T11:11:13.8539952Z c07e3a0400713d546e0dea2d5466dd22ea389c73 third_party/NNPACK (c07e3a0) 2025-12-04T11:11:13.8571860Z 3ebbc93ded7285963bff932c678fa367eb393ba6 third_party/NVTX (v3.1.0-313-g3ebbc93) 2025-12-04T11:11:13.8624332Z 1d8f600fd424278486eade7ed3e877c99f0846b1 third_party/VulkanMemoryAllocator (v2.1.0-982-g1d8f600) 2025-12-04T11:11:13.8906886Z 51a0103656eff6fc9bfd39a4597923c4b542c883 third_party/XNNPACK (remotes/origin/ds/ndk-1243-g51a0103656) 2025-12-04T11:11:13.8931261Z 01aae101b9e5e94d6c16a9514c9fb8df99c93150 third_party/aiter (v0.1.1-92-g01aae101) 2025-12-04T11:11:13.8949101Z 299e5928955cc62af9968370293b916f5130916f third_party/benchmark (v1.9.3) 2025-12-04T11:11:13.9002903Z 7fe50dc3da2069d6645d9deb8c017a876472a977 third_party/composable_kernel (rocm-6.4.3-459-g7fe50dc3d) 2025-12-04T11:11:13.9083843Z 89c932f313c6437c38f2982869beacc89c2f2246 third_party/cpp-httplib (v0.26.0) 2025-12-04T11:11:13.9163552Z f858c30bcb16f8effd5ff46996f0514539e17abc third_party/cpuinfo (f858c30) 2025-12-04T11:11:13.9184858Z 0b1577c8c83401237d601d0d0db5210506705396 third_party/cudnn_frontend (v0.5-61-g0b1577c) 2025-12-04T11:11:13.9252460Z f88806b1e31dfa579842638740216dd41fc6c588 third_party/cutlass (v4.3.1) 2025-12-04T11:11:13.9273396Z c0b988d39a9e47c794d699f29930ed4d7c7e13a4 third_party/fbgemm (v1.4.0-rc1-2-gc0b988d39) 2025-12-04T11:11:13.9337289Z 979702c87a8713a8e0a5e9fee122b90d2ef13be5 third_party/flash-attention (v2.7.4) 2025-12-04T11:11:13.9353729Z a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757 third_party/flatbuffers (v24.12.23) 2025-12-04T11:11:13.9672309Z 407c905e45ad75fc29bf0f9bb7c5c2fd3475976f third_party/fmt (12.1.0) 2025-12-04T11:11:13.9758831Z 3fb5c176c17c765a3492cd2f0321b0dab712f350 third_party/gemmlowp/gemmlowp (remotes/origin/revert-87-master-135-g3fb5c17) 2025-12-04T11:11:13.9841606Z 54cbae0d3a67fa890b4c3d9ee162b7860315e341 third_party/gloo (remotes/origin/gh/c-p-i-o/1/base-37-g54cbae0) 2025-12-04T11:11:14.0021628Z 52eb8108c5bdec04579160ae17225d66034bd723 third_party/googletest (release-1.8.0-3544-g52eb8108) 2025-12-04T11:11:14.0084261Z 719d8e6cd7f7a0e01b155657526d693acf97c2b3 third_party/ideep (pytorch-rls-v3.7.1) 2025-12-04T11:11:14.0125389Z dec1d23ca65ab069d225dfe40dea14f455170959 third_party/ittapi (v3.25.5) 2025-12-04T11:11:14.0270228Z 31f85df8fbd89c188f14ef10f1ec65379786b943 third_party/kineto (heads/main) 2025-12-04T11:11:14.0287914Z d7770c89632329a9914ef1a90289917597639cbe third_party/kleidiai (v1.15.0) 2025-12-04T11:11:14.0303829Z fbd8b99c2b828428947d70fdc046bb55609be93e third_party/mimalloc (v2.2.4) 2025-12-04T11:11:14.0320037Z 55f93686c01528224f448c19128836e7df245f72 third_party/nlohmann (v3.12.0) 2025-12-04T11:11:14.0582941Z e709452ef2bbc1d113faf678c24e6d3467696e83 third_party/onnx (v1.18.0) 2025-12-04T11:11:14.0603098Z a799f4aed9c94b765dcdaabaeab7d5e7e2310878 third_party/opentelemetry-cpp (v1.14.2) 2025-12-04T11:11:14.0622668Z 0fa0ef591e38c2758e3184c6c23e497b9f732ffa third_party/pocketfft (release_for_eigen-40-g0fa0ef5) 2025-12-04T11:11:14.0896114Z d1eca4e4b421cd2997495c4b4e65cea6be4e9b8a third_party/protobuf (v3.7.0-rc.2-1279-gd1eca4e4b) 2025-12-04T11:11:14.0966763Z 072586a71b55b7f8c584153d223e95687148a900 third_party/psimd (heads/master) 2025-12-04T11:11:14.1010706Z 4fe0e1e183925bf8cfa6aae24237e724a96479b8 third_party/pthreadpool (0.1-144-g4fe0e1e) 2025-12-04T11:11:14.1032250Z f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8 third_party/pybind11 
(v3.0.1) 2025-12-04T11:11:14.1082155Z f45429b087dd7d5bc78bb40dc7cf06425c252d67 third_party/python-peachpy (remotes/origin/pre-generated) 2025-12-04T11:11:14.1146562Z 5a1d179df9cf652951b59010a2d2075372d67f68 third_party/sleef (3.8) 2025-12-04T11:11:14.1189683Z 2b4cd91092d335a697416b2a3cb398283246849d third_party/tensorpipe (heads/main) 2025-12-04T11:11:14.1206080Z ##[group]Cleaning the repository 2025-12-04T11:11:14.1212014Z [command]/usr/bin/git clean -ffdx 2025-12-04T11:11:14.1368372Z [command]/usr/bin/git reset --hard HEAD 2025-12-04T11:11:14.2175025Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T11:11:14.2243892Z ##[endgroup] 2025-12-04T11:11:14.2246362Z ##[group]Disabling automatic garbage collection 2025-12-04T11:11:14.2251501Z [command]/usr/bin/git config --local gc.auto 0 2025-12-04T11:11:14.2280249Z ##[endgroup] 2025-12-04T11:11:14.2280497Z ##[group]Setting up auth 2025-12-04T11:11:14.2283724Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T11:11:14.2311014Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T11:11:14.2502846Z Entering 'android/libs/fbjni' 2025-12-04T11:11:14.2527475Z Entering 'third_party/FP16' 2025-12-04T11:11:14.2553853Z Entering 'third_party/FXdiv' 2025-12-04T11:11:14.2580065Z Entering 'third_party/NNPACK' 2025-12-04T11:11:14.2606645Z Entering 'third_party/NVTX' 2025-12-04T11:11:14.2630499Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:14.2656145Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:14.2686551Z Entering 'third_party/aiter' 2025-12-04T11:11:14.2717154Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:14.2742979Z Entering 'third_party/benchmark' 2025-12-04T11:11:14.2767063Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:14.2794645Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:14.2819519Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:14.2843927Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:14.2869283Z Entering 'third_party/cutlass' 2025-12-04T11:11:14.2898067Z Entering 'third_party/fbgemm' 2025-12-04T11:11:14.2924624Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:14.2976631Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:14.3021769Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:14.3054114Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:14.3079090Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:14.3105227Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:14.3120238Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:14.3165326Z Entering 'third_party/flash-attention' 2025-12-04T11:11:14.3197738Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:14.3216984Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:14.3240678Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:14.3265340Z Entering 'third_party/fmt' 2025-12-04T11:11:14.3292728Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:14.3311866Z Entering 'third_party/gloo' 2025-12-04T11:11:14.3339096Z Entering 'third_party/googletest' 2025-12-04T11:11:14.3365525Z Entering 'third_party/ideep' 2025-12-04T11:11:14.3391260Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:14.3412485Z Entering 'third_party/ittapi' 
2025-12-04T11:11:14.3433979Z Entering 'third_party/kineto' 2025-12-04T11:11:14.3459608Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:14.3496254Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:14.3523669Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:14.3550794Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:14.3568821Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:14.3588926Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:14.3608539Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:14.3634822Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:14.3668306Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:14.3691797Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:14.3713293Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:14.3731467Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.3754662Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.3785124Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:14.3817429Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:14.3842083Z Entering 'third_party/kleidiai' 2025-12-04T11:11:14.3865843Z Entering 'third_party/mimalloc' 2025-12-04T11:11:14.3888970Z Entering 'third_party/nlohmann' 2025-12-04T11:11:14.3910873Z Entering 'third_party/onnx' 2025-12-04T11:11:14.3938643Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:14.3968283Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:14.3993532Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:14.4021352Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:14.4043427Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:14.4063519Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:14.4084036Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:14.4104842Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:14.4125549Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:14.4146049Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.4175918Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.4200111Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:14.4231462Z Entering 'third_party/pocketfft' 2025-12-04T11:11:14.4253438Z Entering 'third_party/protobuf' 2025-12-04T11:11:14.4277660Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:14.4302458Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:14.4331298Z Entering 'third_party/psimd' 2025-12-04T11:11:14.4353374Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:14.4375426Z Entering 'third_party/pybind11' 2025-12-04T11:11:14.4397099Z 
Entering 'third_party/python-peachpy' 2025-12-04T11:11:14.4418319Z Entering 'third_party/sleef' 2025-12-04T11:11:14.4440177Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:14.4483132Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:14.4484845Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:14.4508070Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:14.4534962Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:14.4558866Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:14.4610400Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T11:11:14.4630995Z http.https://github.com/.extraheader 2025-12-04T11:11:14.4639822Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-12-04T11:11:14.4664465Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T11:11:14.4837611Z Entering 'android/libs/fbjni' 2025-12-04T11:11:14.4850692Z http.https://github.com/.extraheader 2025-12-04T11:11:14.4875567Z Entering 'third_party/FP16' 2025-12-04T11:11:14.4890492Z http.https://github.com/.extraheader 2025-12-04T11:11:14.4908196Z Entering 'third_party/FXdiv' 2025-12-04T11:11:14.4932279Z http.https://github.com/.extraheader 2025-12-04T11:11:14.4965924Z Entering 'third_party/NNPACK' 2025-12-04T11:11:14.4979714Z http.https://github.com/.extraheader 2025-12-04T11:11:14.4996626Z Entering 'third_party/NVTX' 2025-12-04T11:11:14.5010539Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5031095Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:14.5052955Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5070606Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:14.5084964Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5114359Z Entering 'third_party/aiter' 2025-12-04T11:11:14.5129253Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5149634Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:14.5169154Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5203059Z Entering 'third_party/benchmark' 2025-12-04T11:11:14.5214610Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5233929Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:14.5257378Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5281340Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:14.5302635Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5321326Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:14.5336695Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5361701Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:14.5382190Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5410186Z Entering 'third_party/cutlass' 2025-12-04T11:11:14.5425624Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5449284Z Entering 'third_party/fbgemm' 2025-12-04T11:11:14.5463623Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5485090Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:14.5499052Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5516706Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:14.5529863Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5549532Z 
Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:14.5560502Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5575988Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:14.5589319Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5611371Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:14.5623926Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5640360Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:14.5653085Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5668667Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:14.5680871Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5700318Z Entering 'third_party/flash-attention' 2025-12-04T11:11:14.5727498Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5748806Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:14.5762053Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5790640Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:14.5814483Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5845357Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:14.5861201Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5883233Z Entering 'third_party/fmt' 2025-12-04T11:11:14.5897612Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5916753Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:14.5932380Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5959224Z Entering 'third_party/gloo' 2025-12-04T11:11:14.5973275Z http.https://github.com/.extraheader 2025-12-04T11:11:14.5993051Z Entering 'third_party/googletest' 2025-12-04T11:11:14.6006790Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6027388Z Entering 'third_party/ideep' 2025-12-04T11:11:14.6040625Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6058926Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:14.6072119Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6093474Z Entering 'third_party/ittapi' 2025-12-04T11:11:14.6107312Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6129369Z Entering 'third_party/kineto' 2025-12-04T11:11:14.6142917Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6160579Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:14.6173076Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6188651Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:14.6201432Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6221788Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:14.6235102Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6251250Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:14.6262954Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6278629Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:14.6290596Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6306650Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:14.6318803Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6338805Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:14.6356457Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6375539Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:14.6387477Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6405576Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:14.6419246Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6438011Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:14.6450955Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6468749Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:14.6481472Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6497685Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.6510831Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6529886Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.6542578Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6562262Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:14.6588419Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6607949Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:14.6627488Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6654113Z Entering 'third_party/kleidiai' 2025-12-04T11:11:14.6671572Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6698254Z Entering 'third_party/mimalloc' 2025-12-04T11:11:14.6715120Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6738855Z Entering 'third_party/nlohmann' 2025-12-04T11:11:14.6759551Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6782747Z Entering 'third_party/onnx' 2025-12-04T11:11:14.6799172Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6832373Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:14.6850101Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6878132Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:14.6899037Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6924440Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:14.6944609Z http.https://github.com/.extraheader 2025-12-04T11:11:14.6968767Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:14.6986684Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7007955Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:14.7024257Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7044627Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:14.7060590Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7080305Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:14.7102674Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7120367Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:14.7134848Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7149819Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:14.7164038Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7180161Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.7195284Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7222227Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.7235655Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7256290Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:14.7269265Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7293521Z Entering 'third_party/pocketfft' 2025-12-04T11:11:14.7307916Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7326867Z Entering 'third_party/protobuf' 2025-12-04T11:11:14.7347312Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7370725Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:14.7384461Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7405825Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:14.7418543Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7436714Z Entering 'third_party/psimd' 2025-12-04T11:11:14.7454024Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7471504Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:14.7486040Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7503737Z Entering 'third_party/pybind11' 2025-12-04T11:11:14.7518947Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7536514Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:14.7549590Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7567020Z Entering 'third_party/sleef' 2025-12-04T11:11:14.7580564Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7597971Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:14.7612094Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7628844Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:14.7642749Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7659542Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:14.7671062Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7688454Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:14.7700372Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7716977Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:14.7730847Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7754977Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:14.7769214Z http.https://github.com/.extraheader 2025-12-04T11:11:14.7806337Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.7832697Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T11:11:14.8001127Z Entering 'android/libs/fbjni' 2025-12-04T11:11:14.8018601Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T11:11:14.8034568Z Entering 'third_party/FP16' 2025-12-04T11:11:14.8048478Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T11:11:14.8060191Z Entering 'third_party/FXdiv' 2025-12-04T11:11:14.8074032Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T11:11:14.8084328Z Entering 'third_party/NNPACK' 2025-12-04T11:11:14.8094478Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T11:11:14.8105275Z Entering 'third_party/NVTX' 2025-12-04T11:11:14.8117075Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T11:11:14.8126868Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:14.8137567Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T11:11:14.8158449Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:14.8169671Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T11:11:14.8187119Z Entering 'third_party/aiter' 2025-12-04T11:11:14.8199452Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T11:11:14.8210031Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:14.8216463Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T11:11:14.8234977Z Entering 'third_party/benchmark' 2025-12-04T11:11:14.8247429Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:14.8258829Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:14.8269847Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T11:11:14.8286119Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:14.8296548Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T11:11:14.8315681Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:14.8328010Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T11:11:14.8338636Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:14.8357533Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T11:11:14.8376006Z Entering 'third_party/cutlass' 2025-12-04T11:11:14.8386120Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T11:11:14.8413273Z Entering 'third_party/fbgemm' 2025-12-04T11:11:14.8427021Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T11:11:14.8436575Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:14.8446226Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T11:11:14.8454484Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:14.8464811Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T11:11:14.8476842Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:14.8485157Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T11:11:14.8492040Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:14.8500414Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T11:11:14.8510214Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:14.8519357Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T11:11:14.8528296Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:14.8534536Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T11:11:14.8540828Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:14.8548748Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T11:11:14.8559145Z Entering 'third_party/flash-attention' 2025-12-04T11:11:14.8568632Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T11:11:14.8576917Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:14.8598360Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T11:11:14.8616493Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:14.8631582Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T11:11:14.8650686Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:14.8666761Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T11:11:14.8677472Z Entering 'third_party/fmt' 2025-12-04T11:11:14.8688958Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:14.8699325Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:14.8714798Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T11:11:14.8729348Z Entering 'third_party/gloo' 2025-12-04T11:11:14.8741561Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T11:11:14.8756765Z Entering 'third_party/googletest' 2025-12-04T11:11:14.8766817Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.8776844Z Entering 'third_party/ideep' 2025-12-04T11:11:14.8786460Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T11:11:14.8798673Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:14.8809228Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T11:11:14.8821802Z Entering 'third_party/ittapi' 2025-12-04T11:11:14.8833991Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T11:11:14.8843021Z Entering 'third_party/kineto' 2025-12-04T11:11:14.8852351Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T11:11:14.8866182Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:14.8873530Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T11:11:14.8880736Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:14.8913792Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T11:11:14.8924781Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:14.8936163Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T11:11:14.8942931Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:14.8952862Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:14.8975728Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:14.8986279Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T11:11:14.8991868Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:14.9003311Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T11:11:14.9011135Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:14.9020547Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T11:11:14.9027774Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:14.9036148Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.9042713Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:14.9051184Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T11:11:14.9057684Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:14.9064971Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T11:11:14.9072834Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:14.9080607Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:14.9088320Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.9096815Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:14.9104625Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.9113791Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:14.9125165Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:14.9134854Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T11:11:14.9142518Z 
Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:14.9153270Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.9164065Z Entering 'third_party/kleidiai' 2025-12-04T11:11:14.9176013Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T11:11:14.9187668Z Entering 'third_party/mimalloc' 2025-12-04T11:11:14.9198264Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T11:11:14.9209716Z Entering 'third_party/nlohmann' 2025-12-04T11:11:14.9220046Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T11:11:14.9231453Z Entering 'third_party/onnx' 2025-12-04T11:11:14.9241796Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T11:11:14.9259387Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:14.9268842Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:14.9279555Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:14.9290759Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T11:11:14.9301251Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:14.9310224Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:14.9317750Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:14.9326613Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.9334385Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:14.9352167Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T11:11:14.9366998Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:14.9379828Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T11:11:14.9389423Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:14.9398816Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T11:11:14.9408732Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:14.9418423Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T11:11:14.9425589Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:14.9436055Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:14.9443611Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:14.9452504Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:14.9461183Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:14.9471369Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:14.9481036Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:14.9489833Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T11:11:14.9509317Z Entering 'third_party/pocketfft' 2025-12-04T11:11:14.9520692Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T11:11:14.9531972Z Entering 'third_party/protobuf' 2025-12-04T11:11:14.9542558Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T11:11:14.9553628Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:14.9562981Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:14.9570785Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:14.9581639Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.9592138Z Entering 'third_party/psimd' 2025-12-04T11:11:14.9602818Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T11:11:14.9614807Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:14.9624972Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T11:11:14.9635506Z Entering 'third_party/pybind11' 2025-12-04T11:11:14.9645802Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:14.9656351Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:14.9674724Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T11:11:14.9685643Z Entering 'third_party/sleef' 2025-12-04T11:11:14.9696882Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T11:11:14.9707693Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:14.9718904Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T11:11:14.9729398Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:14.9738942Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:14.9754892Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:14.9772030Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T11:11:14.9782102Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:14.9791479Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T11:11:14.9799937Z Entering 'third_party/tensorpipe/third_party/pybind11' 
2025-12-04T11:11:14.9815054Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:14.9822949Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:14.9833947Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T11:11:14.9869143Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9895222Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9914101Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9931420Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9947764Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9964436Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9980651Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:14.9997377Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0012485Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0028547Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0043334Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0059954Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0083678Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0099923Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0114757Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0130754Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0144183Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0159874Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0176157Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0193922Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0208596Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0239227Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0261106Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0277422Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0296332Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0311701Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0326626Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0341806Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0359110Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0374122Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0390301Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0404949Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0420027Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0437482Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 
2025-12-04T11:11:15.0453210Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0479562Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0499396Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0516077Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0533156Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0548638Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0564426Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0581317Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0599242Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0614430Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0628909Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0643919Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0658339Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0672813Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0686644Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0701563Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0716112Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0730314Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0744182Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0758573Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0786305Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0814110Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0843301Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0872448Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0893509Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0915508Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0935088Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0954509Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0974814Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.0993558Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1011710Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1036493Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1056118Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1076309Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1096270Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1116102Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1135220Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1154690Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1173949Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1193077Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1211051Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1231501Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1249108Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1268770Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1288455Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1308027Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1327976Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T11:11:15.1350612Z [command]/usr/bin/git config --local http.https://github.com/.extraheader 
AUTHORIZATION: basic *** 2025-12-04T11:11:15.1589412Z ##[endgroup] 2025-12-04T11:11:15.1589870Z ##[group]Fetching the repository 2025-12-04T11:11:15.1594374Z [command]/usr/bin/git -c protocol.version=2 fetch --prune --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/tags/*:refs/tags/* 2025-12-04T11:11:16.8122301Z [command]/usr/bin/git rev-parse --verify --quiet ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32^{object} 2025-12-04T11:11:16.8261687Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:16.8265467Z ##[endgroup] 2025-12-04T11:11:16.8265831Z ##[group]Determining the checkout info 2025-12-04T11:11:16.8266700Z ##[endgroup] 2025-12-04T11:11:16.8271709Z [command]/usr/bin/git sparse-checkout disable 2025-12-04T11:11:16.8405773Z [command]/usr/bin/git config --local --unset-all extensions.worktreeConfig 2025-12-04T11:11:16.8430385Z ##[group]Checking out the ref 2025-12-04T11:11:16.8434309Z [command]/usr/bin/git checkout --progress --force ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:16.8790624Z HEAD is now at ffd9b0fb4355 Resolve collective autotuning test failure on arm (#168919) 2025-12-04T11:11:16.8799138Z ##[endgroup] 2025-12-04T11:11:16.8799532Z ##[group]Setting up auth for fetching submodules 2025-12-04T11:11:16.8805133Z [command]/usr/bin/git config --global http.https://github.com/.extraheader AUTHORIZATION: basic *** 2025-12-04T11:11:16.8851238Z [command]/usr/bin/git config --global --unset-all url.https://github.com/.insteadOf 2025-12-04T11:11:16.8874825Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf git@github.com: 2025-12-04T11:11:16.8903951Z [command]/usr/bin/git config --global --add url.https://github.com/.insteadOf org-21003710@github.com: 2025-12-04T11:11:16.8921919Z ##[endgroup] 2025-12-04T11:11:16.8922163Z ##[group]Fetching submodules 2025-12-04T11:11:16.8924386Z [command]/usr/bin/git submodule sync --recursive 2025-12-04T11:11:16.9112738Z Synchronizing submodule url for 'android/libs/fbjni' 2025-12-04T11:11:16.9123368Z Synchronizing submodule url for 'third_party/FP16' 2025-12-04T11:11:16.9135045Z Synchronizing submodule url for 'third_party/FXdiv' 2025-12-04T11:11:16.9146044Z Synchronizing submodule url for 'third_party/NNPACK' 2025-12-04T11:11:16.9157189Z Synchronizing submodule url for 'third_party/NVTX' 2025-12-04T11:11:16.9168203Z Synchronizing submodule url for 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:16.9179083Z Synchronizing submodule url for 'third_party/XNNPACK' 2025-12-04T11:11:16.9195589Z Synchronizing submodule url for 'third_party/aiter' 2025-12-04T11:11:16.9208636Z Synchronizing submodule url for 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:16.9223178Z Synchronizing submodule url for 'third_party/benchmark' 2025-12-04T11:11:16.9234246Z Synchronizing submodule url for 'third_party/composable_kernel' 2025-12-04T11:11:16.9247424Z Synchronizing submodule url for 'third_party/cpp-httplib' 2025-12-04T11:11:16.9258303Z Synchronizing submodule url for 'third_party/cpuinfo' 2025-12-04T11:11:16.9269772Z Synchronizing submodule url for 'third_party/cudnn_frontend' 2025-12-04T11:11:16.9280627Z Synchronizing submodule url for 'third_party/cutlass' 2025-12-04T11:11:16.9294070Z Synchronizing submodule url for 'third_party/fbgemm' 2025-12-04T11:11:16.9305071Z Synchronizing submodule url for 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:16.9315370Z Synchronizing submodule url for 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:16.9330551Z Synchronizing 
submodule url for 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:16.9341190Z Synchronizing submodule url for 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:16.9353786Z Synchronizing submodule url for 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:16.9364296Z Synchronizing submodule url for 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:16.9380823Z Synchronizing submodule url for 'third_party/fbgemm/external/json' 2025-12-04T11:11:16.9394172Z Synchronizing submodule url for 'third_party/flash-attention' 2025-12-04T11:11:16.9405796Z Synchronizing submodule url for 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:16.9420263Z Synchronizing submodule url for 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:16.9436234Z Synchronizing submodule url for 'third_party/flatbuffers' 2025-12-04T11:11:16.9449024Z Synchronizing submodule url for 'third_party/fmt' 2025-12-04T11:11:16.9460325Z Synchronizing submodule url for 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:16.9470254Z Synchronizing submodule url for 'third_party/gloo' 2025-12-04T11:11:16.9481126Z Synchronizing submodule url for 'third_party/googletest' 2025-12-04T11:11:16.9490645Z Synchronizing submodule url for 'third_party/ideep' 2025-12-04T11:11:16.9503963Z Synchronizing submodule url for 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:16.9515698Z Synchronizing submodule url for 'third_party/ittapi' 2025-12-04T11:11:16.9526881Z Synchronizing submodule url for 'third_party/kineto' 2025-12-04T11:11:16.9537484Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:16.9551980Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:16.9564413Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:16.9580005Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:16.9591135Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:16.9601885Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:16.9621612Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:16.9639961Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:16.9651583Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:16.9661374Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:16.9670873Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:16.9684324Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:16.9703905Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:16.9719134Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:16.9731317Z Synchronizing submodule url for 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:16.9744070Z Synchronizing submodule url for 
'third_party/kleidiai' 2025-12-04T11:11:16.9758616Z Synchronizing submodule url for 'third_party/mimalloc' 2025-12-04T11:11:16.9770048Z Synchronizing submodule url for 'third_party/nlohmann' 2025-12-04T11:11:16.9780532Z Synchronizing submodule url for 'third_party/onnx' 2025-12-04T11:11:16.9799238Z Synchronizing submodule url for 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:16.9816156Z Synchronizing submodule url for 'third_party/opentelemetry-cpp' 2025-12-04T11:11:16.9829072Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:16.9846359Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:16.9857591Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:16.9867719Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:16.9879583Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:16.9890166Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:16.9900122Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:16.9913407Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:16.9923275Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:16.9944190Z Synchronizing submodule url for 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:16.9962197Z Synchronizing submodule url for 'third_party/pocketfft' 2025-12-04T11:11:16.9973302Z Synchronizing submodule url for 'third_party/protobuf' 2025-12-04T11:11:16.9993843Z Synchronizing submodule url for 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:17.0009319Z Synchronizing submodule url for 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:17.0036270Z Synchronizing submodule url for 'third_party/psimd' 2025-12-04T11:11:17.0051559Z Synchronizing submodule url for 'third_party/pthreadpool' 2025-12-04T11:11:17.0064702Z Synchronizing submodule url for 'third_party/pybind11' 2025-12-04T11:11:17.0098132Z Synchronizing submodule url for 'third_party/python-peachpy' 2025-12-04T11:11:17.0124286Z Synchronizing submodule url for 'third_party/sleef' 2025-12-04T11:11:17.0146320Z Synchronizing submodule url for 'third_party/tensorpipe' 2025-12-04T11:11:17.0163494Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:17.0174836Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:17.0184363Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:17.0194113Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:17.0205781Z Synchronizing submodule url for 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:17.0249406Z [command]/usr/bin/git -c protocol.version=2 submodule update --init --force --recursive 2025-12-04T11:11:17.0446090Z Submodule path 'android/libs/fbjni': checked out '7e1e1fe3858c63c251c637ae41a20de425dde96f' 2025-12-04T11:11:17.0498419Z Submodule path 'third_party/FP16': checked out '4dfe081cf6bcd15db339cf2680b9281b8451eeb3' 2025-12-04T11:11:17.0540176Z Submodule path 'third_party/FXdiv': checked out 
'b408327ac2a15ec3e43352421954f5b1967701d1' 2025-12-04T11:11:17.0601404Z Submodule path 'third_party/NNPACK': checked out 'c07e3a0400713d546e0dea2d5466dd22ea389c73' 2025-12-04T11:11:17.0682260Z Submodule path 'third_party/NVTX': checked out '3ebbc93ded7285963bff932c678fa367eb393ba6' 2025-12-04T11:11:17.0747482Z Submodule path 'third_party/VulkanMemoryAllocator': checked out '1d8f600fd424278486eade7ed3e877c99f0846b1' 2025-12-04T11:11:17.0904479Z Submodule path 'third_party/XNNPACK': checked out '51a0103656eff6fc9bfd39a4597923c4b542c883' 2025-12-04T11:11:17.1120336Z Submodule path 'third_party/aiter': checked out '01aae101b9e5e94d6c16a9514c9fb8df99c93150' 2025-12-04T11:11:17.1334952Z Submodule path 'third_party/aiter/3rdparty/composable_kernel': checked out 'cffe8fa2a442ac8e80dd236a1a5d24fe3d7e0cbf' 2025-12-04T11:11:17.1411819Z Submodule path 'third_party/benchmark': checked out '299e5928955cc62af9968370293b916f5130916f' 2025-12-04T11:11:17.1623100Z Submodule path 'third_party/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T11:11:17.1700862Z Submodule path 'third_party/cpp-httplib': checked out '89c932f313c6437c38f2982869beacc89c2f2246' 2025-12-04T11:11:17.1755731Z Submodule path 'third_party/cpuinfo': checked out 'f858c30bcb16f8effd5ff46996f0514539e17abc' 2025-12-04T11:11:17.1837004Z Submodule path 'third_party/cudnn_frontend': checked out '0b1577c8c83401237d601d0d0db5210506705396' 2025-12-04T11:11:17.1942159Z Submodule path 'third_party/cutlass': checked out 'f88806b1e31dfa579842638740216dd41fc6c588' 2025-12-04T11:11:17.2070666Z Submodule path 'third_party/fbgemm': checked out 'c0b988d39a9e47c794d699f29930ed4d7c7e13a4' 2025-12-04T11:11:17.2128211Z Submodule path 'third_party/fbgemm/external/asmjit': checked out 'a3199e8857792cd10b7589ff5d58343d2c9008ea' 2025-12-04T11:11:17.2398448Z Submodule path 'third_party/fbgemm/external/composable_kernel': checked out '7fe50dc3da2069d6645d9deb8c017a876472a977' 2025-12-04T11:11:17.2469046Z Submodule path 'third_party/fbgemm/external/cpuinfo': checked out '6543fec09b2f04ac4a666882998b534afc9c1349' 2025-12-04T11:11:17.2605804Z Submodule path 'third_party/fbgemm/external/cutlass': checked out '98125ce499b0fdf7ffbe0e3052f5b8709f4840f8' 2025-12-04T11:11:17.2683270Z Submodule path 'third_party/fbgemm/external/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:17.2738409Z Submodule path 'third_party/fbgemm/external/hipify_torch': checked out '63b6a7b541fa7f08f8475ca7d74054db36ff2691' 2025-12-04T11:11:17.2840116Z Submodule path 'third_party/fbgemm/external/json': checked out '9cca280a4d0ccf0c08f47a99aa71d1b0e52f8d03' 2025-12-04T11:11:17.2921333Z Submodule path 'third_party/flash-attention': checked out '979702c87a8713a8e0a5e9fee122b90d2ef13be5' 2025-12-04T11:11:17.3101252Z Submodule path 'third_party/flash-attention/csrc/composable_kernel': checked out '888317e698e9803c62bd38568abc9e05d7709f33' 2025-12-04T11:11:17.3220209Z Submodule path 'third_party/flash-attention/csrc/cutlass': checked out 'c506e16788cb08416a4a57e11a9067beeee29420' 2025-12-04T11:11:17.3316905Z Submodule path 'third_party/flatbuffers': checked out 'a2cd1ea3b6d3fee220106b5fed3f7ce8da9eb757' 2025-12-04T11:11:17.3371620Z Submodule path 'third_party/fmt': checked out '407c905e45ad75fc29bf0f9bb7c5c2fd3475976f' 2025-12-04T11:11:17.3439931Z Submodule path 'third_party/gemmlowp/gemmlowp': checked out '3fb5c176c17c765a3492cd2f0321b0dab712f350' 2025-12-04T11:11:17.3497399Z Submodule path 'third_party/gloo': checked out 
'54cbae0d3a67fa890b4c3d9ee162b7860315e341' 2025-12-04T11:11:17.3556862Z Submodule path 'third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:17.3614962Z Submodule path 'third_party/ideep': checked out '719d8e6cd7f7a0e01b155657526d693acf97c2b3' 2025-12-04T11:11:17.3826853Z Submodule path 'third_party/ideep/mkl-dnn': checked out '8d263e693366ef8db40acc569cc7d8edf644556d' 2025-12-04T11:11:17.3885630Z Submodule path 'third_party/ittapi': checked out 'dec1d23ca65ab069d225dfe40dea14f455170959' 2025-12-04T11:11:17.3945555Z Submodule path 'third_party/kineto': checked out '31f85df8fbd89c188f14ef10f1ec65379786b943' 2025-12-04T11:11:17.4034069Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog': checked out 'd2ffe0a4e3acace628db49974246b66fc3e85fb1' 2025-12-04T11:11:17.4107214Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM': checked out 'ffde4e54bc7249a6039a5e6b45b395141e1217f9' 2025-12-04T11:11:17.4162040Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr': checked out '871ed52d350214a034f6ef8a3b8f51c5ce1bd400' 2025-12-04T11:11:17.4209919Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt': checked out 'cd4af11efc9c622896a3e4cb599fa28668ca3d05' 2025-12-04T11:11:17.4262248Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags': checked out 'e171aa2d15ed9eb17054558e0b3a6a413bb01067' 2025-12-04T11:11:17.4331533Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc': checked out '8411df715cf522606e3b1aca386ddfc0b63d34b4' 2025-12-04T11:11:17.4394188Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog': checked out 'b33e3bad4c46c8a6345525fd822af355e5ef9446' 2025-12-04T11:11:17.4444273Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:17.4531024Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/json': checked out '4f8fba14066156b73f1189a2b8bd568bde5284c5' 2025-12-04T11:11:17.4574072Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs': checked out 'f68a2fa8ea36c783bdd760371411fcb495aa3150' 2025-12-04T11:11:17.4640595Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp': checked out 'b1234816facfdda29845c46696a02998a4af115a' 2025-12-04T11:11:17.4730636Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'd7ba35bbb649209c66e582d5a0244ba988a15159' 2025-12-04T11:11:17.4799385Z Submodule path 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T11:11:17.4864213Z Submodule path 'third_party/kineto/libkineto/third_party/fmt': checked out '40626af88bd7df9a5fb80be7b25ac85b122d6c21' 2025-12-04T11:11:17.4919032Z Submodule path 'third_party/kineto/libkineto/third_party/googletest': checked out '52eb8108c5bdec04579160ae17225d66034bd723' 2025-12-04T11:11:17.4997768Z Submodule path 'third_party/kleidiai': checked out 'd7770c89632329a9914ef1a90289917597639cbe' 2025-12-04T11:11:17.5074210Z Submodule path 'third_party/mimalloc': checked out 'fbd8b99c2b828428947d70fdc046bb55609be93e' 2025-12-04T11:11:17.5165049Z Submodule path 'third_party/nlohmann': checked out 
'55f93686c01528224f448c19128836e7df245f72' 2025-12-04T11:11:17.5335050Z Submodule path 'third_party/onnx': checked out 'e709452ef2bbc1d113faf678c24e6d3467696e83' 2025-12-04T11:11:17.5411182Z Submodule path 'third_party/onnx/third_party/pybind11': checked out 'a2e59f0e7065404b44dfe92a28aca47ba1378dc4' 2025-12-04T11:11:17.5497851Z Submodule path 'third_party/opentelemetry-cpp': checked out 'a799f4aed9c94b765dcdaabaeab7d5e7e2310878' 2025-12-04T11:11:17.5551256Z Submodule path 'third_party/opentelemetry-cpp/third_party/benchmark': checked out 'd572f4777349d43653b21d6c2fc63020ab326db2' 2025-12-04T11:11:17.5602464Z Submodule path 'third_party/opentelemetry-cpp/third_party/googletest': checked out 'b796f7d44681514f58a683a3a71ff17c94edb0c1' 2025-12-04T11:11:17.5650248Z Submodule path 'third_party/opentelemetry-cpp/third_party/ms-gsl': checked out '6f4529395c5b7c2d661812257cd6780c67e54afa' 2025-12-04T11:11:17.5730547Z Submodule path 'third_party/opentelemetry-cpp/third_party/nlohmann-json': checked out 'bc889afb4c5bf1c0d8ee29ef35eaaf4c8bef8a5d' 2025-12-04T11:11:17.5787472Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto': checked out '4ca4f0335c63cda7ab31ea7ed70d6553aee14dce' 2025-12-04T11:11:17.5833799Z Submodule path 'third_party/opentelemetry-cpp/third_party/opentracing-cpp': checked out '06b57f48ded1fa3bdd3d4346f6ef29e40e08eaf5' 2025-12-04T11:11:17.5903961Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp': checked out 'c9ffcdda9086ffd9e1283ea7a0276d831f3c8a8d' 2025-12-04T11:11:17.5983584Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb': checked out 'eefb26f82b233268fc98577d265352720d477ba4' 2025-12-04T11:11:17.6072711Z Submodule path 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest': checked out 'e2239ee6043f73722e7aa812a459f54a28552929' 2025-12-04T11:11:17.6229919Z Submodule path 'third_party/opentelemetry-cpp/tools/vcpkg': checked out '8eb57355a4ffb410a2e94c07b4dca2dffbee8e50' 2025-12-04T11:11:17.6309437Z Submodule path 'third_party/pocketfft': checked out '0fa0ef591e38c2758e3184c6c23e497b9f732ffa' 2025-12-04T11:11:17.6505068Z Submodule path 'third_party/protobuf': checked out 'd1eca4e4b421cd2997495c4b4e65cea6be4e9b8a' 2025-12-04T11:11:17.6595768Z Submodule path 'third_party/protobuf/third_party/benchmark': checked out '5b7683f49e1e9223cf9927b24f6fd3d6bd82e3f8' 2025-12-04T11:11:17.6653294Z Submodule path 'third_party/protobuf/third_party/googletest': checked out '5ec7f0c4a113e2f18ac2c6cc7df51ad6afc24081' 2025-12-04T11:11:17.6735389Z Submodule path 'third_party/psimd': checked out '072586a71b55b7f8c584153d223e95687148a900' 2025-12-04T11:11:17.6792760Z Submodule path 'third_party/pthreadpool': checked out '4fe0e1e183925bf8cfa6aae24237e724a96479b8' 2025-12-04T11:11:17.6869870Z Submodule path 'third_party/pybind11': checked out 'f5fbe867d2d26e4a0a9177a51f6e568868ad3dc8' 2025-12-04T11:11:17.6919963Z Submodule path 'third_party/python-peachpy': checked out 'f45429b087dd7d5bc78bb40dc7cf06425c252d67' 2025-12-04T11:11:17.6982493Z Submodule path 'third_party/sleef': checked out '5a1d179df9cf652951b59010a2d2075372d67f68' 2025-12-04T11:11:17.7053589Z Submodule path 'third_party/tensorpipe': checked out '2b4cd91092d335a697416b2a3cb398283246849d' 2025-12-04T11:11:17.7121476Z Submodule path 'third_party/tensorpipe/third_party/googletest': checked out 'aee0f9d9b5b87796ee8a0ab26b7587ec30e8858e' 2025-12-04T11:11:17.7180909Z Submodule path 'third_party/tensorpipe/third_party/libnop': 
checked out '910b55815be16109f04f4180e9adee14fb4ce281' 2025-12-04T11:11:17.7358508Z Submodule path 'third_party/tensorpipe/third_party/libuv': checked out '5152db2cbfeb5582e9c27c5ea1dba2cd9e10759b' 2025-12-04T11:11:17.7449933Z Submodule path 'third_party/tensorpipe/third_party/pybind11': checked out 'a23996fce38ff6ccfbcdc09f1e63f2c4be5ea2ef' 2025-12-04T11:11:17.7500849Z Submodule path 'third_party/tensorpipe/third_party/pybind11/tools/clang': checked out '6a00cbc4a9b8e68b71caf7f774b3f9c753ae84d5' 2025-12-04T11:11:17.7549160Z [command]/usr/bin/git submodule foreach --recursive git config --local gc.auto 0 2025-12-04T11:11:17.7715631Z Entering 'android/libs/fbjni' 2025-12-04T11:11:17.7739792Z Entering 'third_party/FP16' 2025-12-04T11:11:17.7764717Z Entering 'third_party/FXdiv' 2025-12-04T11:11:17.7788905Z Entering 'third_party/NNPACK' 2025-12-04T11:11:17.7812711Z Entering 'third_party/NVTX' 2025-12-04T11:11:17.7853024Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:17.7895087Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:17.7934401Z Entering 'third_party/aiter' 2025-12-04T11:11:17.7980568Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:17.8018426Z Entering 'third_party/benchmark' 2025-12-04T11:11:17.8047639Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:17.8093673Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:17.8120128Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:17.8155152Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:17.8188661Z Entering 'third_party/cutlass' 2025-12-04T11:11:17.8217286Z Entering 'third_party/fbgemm' 2025-12-04T11:11:17.8253623Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:17.8277423Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:17.8321120Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:17.8353180Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:17.8388671Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:17.8418990Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:17.8444743Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:17.8476056Z Entering 'third_party/flash-attention' 2025-12-04T11:11:17.8503702Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:17.8531164Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:17.8556057Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:17.8591339Z Entering 'third_party/fmt' 2025-12-04T11:11:17.8614565Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:17.8635780Z Entering 'third_party/gloo' 2025-12-04T11:11:17.8657356Z Entering 'third_party/googletest' 2025-12-04T11:11:17.8677476Z Entering 'third_party/ideep' 2025-12-04T11:11:17.8699962Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:17.8722925Z Entering 'third_party/ittapi' 2025-12-04T11:11:17.8742502Z Entering 'third_party/kineto' 2025-12-04T11:11:17.8761636Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:17.8795534Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:17.8822374Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:17.8844019Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:17.8871759Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:17.8917858Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:17.8965204Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:17.9002630Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:17.9025885Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:17.9063167Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:17.9085280Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:17.9111681Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:17.9156684Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:17.9192886Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:17.9221031Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:17.9256929Z Entering 'third_party/kleidiai' 2025-12-04T11:11:17.9286519Z Entering 'third_party/mimalloc' 2025-12-04T11:11:17.9315266Z Entering 'third_party/nlohmann' 2025-12-04T11:11:17.9336805Z Entering 'third_party/onnx' 2025-12-04T11:11:17.9381534Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:17.9426752Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:17.9458228Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:17.9493572Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:17.9523706Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:17.9557137Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:17.9580831Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:17.9618902Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:17.9647628Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:17.9675181Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:17.9704080Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:17.9736664Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:17.9773519Z Entering 'third_party/pocketfft' 2025-12-04T11:11:17.9803352Z Entering 'third_party/protobuf' 2025-12-04T11:11:17.9827019Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:17.9849918Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:17.9871808Z Entering 'third_party/psimd' 2025-12-04T11:11:17.9900134Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:17.9921701Z Entering 'third_party/pybind11' 2025-12-04T11:11:17.9947223Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:17.9967181Z Entering 'third_party/sleef' 2025-12-04T11:11:17.9988893Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:18.0010067Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:18.0029240Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:18.0047837Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:18.0067194Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:18.0089331Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:18.0122308Z 
##[endgroup] 2025-12-04T11:11:18.0122576Z ##[group]Persisting credentials for submodules 2025-12-04T11:11:18.0129225Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'url\.https\:\/\/github\.com\/\.insteadOf' && git config --local --unset-all 'url.https://github.com/.insteadOf' || :" 2025-12-04T11:11:18.0280292Z Entering 'android/libs/fbjni' 2025-12-04T11:11:18.0294138Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0294331Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0311481Z Entering 'third_party/FP16' 2025-12-04T11:11:18.0323551Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0323739Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0338392Z Entering 'third_party/FXdiv' 2025-12-04T11:11:18.0352090Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0352294Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0367793Z Entering 'third_party/NNPACK' 2025-12-04T11:11:18.0380615Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0380804Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0397669Z Entering 'third_party/NVTX' 2025-12-04T11:11:18.0409886Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0410079Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0425275Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:18.0441989Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0442127Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0466881Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:18.0480358Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0480644Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0513444Z Entering 'third_party/aiter' 2025-12-04T11:11:18.0526430Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0526571Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0557919Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:18.0570910Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0571042Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0600655Z Entering 'third_party/benchmark' 2025-12-04T11:11:18.0614582Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0614811Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0634189Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:18.0647488Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0647628Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0671963Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:18.0684164Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0684285Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0704687Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:18.0723021Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0723260Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0742053Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:18.0763451Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0763587Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0787557Z Entering 'third_party/cutlass' 2025-12-04T11:11:18.0802144Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0802280Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0827734Z Entering 'third_party/fbgemm' 2025-12-04T11:11:18.0842217Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0842568Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0861464Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:18.0875624Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0875754Z 
url.https://github.com/.insteadof 2025-12-04T11:11:18.0901165Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:18.0915381Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0915510Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0939760Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:18.0953610Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0953744Z url.https://github.com/.insteadof 2025-12-04T11:11:18.0973618Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:18.0988913Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1007646Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1007790Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:18.1020723Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1020862Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1043282Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:18.1057103Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1057256Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1073833Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:18.1086456Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1086587Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1106750Z Entering 'third_party/flash-attention' 2025-12-04T11:11:18.1122258Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1122375Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1143583Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:18.1157657Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1158058Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1177210Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:18.1193711Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1193858Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1216472Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:18.1234551Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1234694Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1254199Z Entering 'third_party/fmt' 2025-12-04T11:11:18.1268010Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1268217Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1285201Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:18.1298936Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1299071Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1316848Z Entering 'third_party/gloo' 2025-12-04T11:11:18.1331918Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1332065Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1348964Z Entering 'third_party/googletest' 2025-12-04T11:11:18.1363654Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1363800Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1381260Z Entering 'third_party/ideep' 2025-12-04T11:11:18.1395444Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1395596Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1412373Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:18.1425909Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1426061Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1448355Z Entering 'third_party/ittapi' 2025-12-04T11:11:18.1462608Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1462759Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1480848Z Entering 'third_party/kineto' 2025-12-04T11:11:18.1494492Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1494637Z 
url.https://github.com/.insteadof 2025-12-04T11:11:18.1512868Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:18.1529377Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1529764Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1549452Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:18.1563290Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1563540Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1582340Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:18.1597380Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1597528Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1615833Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:18.1629947Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1630095Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1648419Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:18.1663013Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1663147Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1682394Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:18.1694927Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1695061Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1727589Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:18.1746409Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1746561Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1763669Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:18.1784012Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1784224Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1805569Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:18.1822301Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1822450Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1840203Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:18.1855737Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1855895Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1873293Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:18.1886321Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1886475Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1904028Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.1918951Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1940698Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1940950Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.1957446Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1957600Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1978671Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:18.1993362Z url.https://github.com/.insteadof 2025-12-04T11:11:18.1993513Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2013552Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:18.2027043Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2027196Z url.https://github.com/.insteadof 
2025-12-04T11:11:18.2047827Z Entering 'third_party/kleidiai' 2025-12-04T11:11:18.2063114Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2063266Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2081331Z Entering 'third_party/mimalloc' 2025-12-04T11:11:18.2095839Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2095998Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2113830Z Entering 'third_party/nlohmann' 2025-12-04T11:11:18.2128025Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2128222Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2146897Z Entering 'third_party/onnx' 2025-12-04T11:11:18.2161444Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2161598Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2186034Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:18.2199766Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2199923Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2220842Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:18.2234574Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2234728Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2261529Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:18.2274832Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2274981Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2294810Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:18.2310196Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2310345Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2331374Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:18.2345646Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2345812Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2364411Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:18.2385921Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2386073Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2403083Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:18.2420882Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2421027Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2439141Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:18.2454121Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2454271Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2476833Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:18.2493385Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2493532Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2516608Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.2534881Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2535119Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2555129Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.2570016Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2570163Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2590639Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:18.2609244Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2609403Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2635781Z Entering 'third_party/pocketfft' 2025-12-04T11:11:18.2650842Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2650994Z 
url.https://github.com/.insteadof 2025-12-04T11:11:18.2668002Z Entering 'third_party/protobuf' 2025-12-04T11:11:18.2683188Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2683335Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2703291Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:18.2717490Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2717644Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2736377Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:18.2749924Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2750080Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2771214Z Entering 'third_party/psimd' 2025-12-04T11:11:18.2790239Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2790389Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2809261Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:18.2824179Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2824329Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2843018Z Entering 'third_party/pybind11' 2025-12-04T11:11:18.2857633Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2857899Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2875980Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:18.2890490Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2890810Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2908349Z Entering 'third_party/sleef' 2025-12-04T11:11:18.2924061Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2924372Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2942024Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:18.2956659Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2956846Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2973993Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:18.2987837Z url.https://github.com/.insteadof 2025-12-04T11:11:18.2987969Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3008063Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:18.3021759Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3021881Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3039500Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:18.3053842Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3053975Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3073161Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:18.3089793Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3090086Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3107388Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:18.3124034Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3124173Z url.https://github.com/.insteadof 2025-12-04T11:11:18.3161595Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local 'http.https://github.com/.extraheader' 'AUTHORIZATION: basic ***' && git config --local --show-origin --name-only --get-regexp remote.origin.url" 2025-12-04T11:11:18.3329477Z Entering 'android/libs/fbjni' 2025-12-04T11:11:18.3356046Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T11:11:18.3367573Z Entering 'third_party/FP16' 2025-12-04T11:11:18.3391774Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T11:11:18.3402952Z Entering 'third_party/FXdiv' 2025-12-04T11:11:18.3426956Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T11:11:18.3437917Z Entering 'third_party/NNPACK' 2025-12-04T11:11:18.3460416Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T11:11:18.3473899Z Entering 'third_party/NVTX' 2025-12-04T11:11:18.3498429Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T11:11:18.3510143Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:18.3533969Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T11:11:18.3545117Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:18.3567191Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T11:11:18.3584034Z Entering 'third_party/aiter' 2025-12-04T11:11:18.3605887Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T11:11:18.3617625Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:18.3639285Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T11:11:18.3654705Z Entering 'third_party/benchmark' 2025-12-04T11:11:18.3676950Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:18.3689377Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:18.3711584Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T11:11:18.3725782Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:18.3747944Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T11:11:18.3759249Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:18.3783109Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T11:11:18.3794236Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:18.3815525Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T11:11:18.3826507Z Entering 'third_party/cutlass' 2025-12-04T11:11:18.3848492Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T11:11:18.3863458Z Entering 'third_party/fbgemm' 2025-12-04T11:11:18.3884889Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T11:11:18.3897945Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:18.3927850Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T11:11:18.3937936Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:18.3959586Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T11:11:18.3972608Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:18.3994720Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T11:11:18.4005008Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:18.4025894Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T11:11:18.4040202Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:18.4061125Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T11:11:18.4071108Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:18.4091916Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T11:11:18.4101638Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:18.4123138Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T11:11:18.4135300Z Entering 'third_party/flash-attention' 2025-12-04T11:11:18.4158843Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T11:11:18.4170559Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:18.4192197Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T11:11:18.4204940Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:18.4231989Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T11:11:18.4247340Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:18.4271030Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T11:11:18.4283615Z Entering 'third_party/fmt' 2025-12-04T11:11:18.4304955Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:18.4317487Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:18.4339766Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T11:11:18.4351356Z Entering 'third_party/gloo' 2025-12-04T11:11:18.4373003Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T11:11:18.4384016Z Entering 'third_party/googletest' 2025-12-04T11:11:18.4404989Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.4416795Z Entering 'third_party/ideep' 2025-12-04T11:11:18.4438472Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T11:11:18.4448826Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:18.4469707Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T11:11:18.4484408Z Entering 'third_party/ittapi' 2025-12-04T11:11:18.4505949Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T11:11:18.4516915Z Entering 'third_party/kineto' 2025-12-04T11:11:18.4538614Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T11:11:18.4549611Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:18.4570271Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T11:11:18.4582824Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:18.4606860Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T11:11:18.4618317Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:18.4640439Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T11:11:18.4650586Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:18.4683981Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T11:11:18.4694445Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:18.4719324Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T11:11:18.4729017Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:18.4750169Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T11:11:18.4762022Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:18.4784115Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T11:11:18.4794221Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:18.4818088Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.4827959Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:18.4850899Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T11:11:18.4861159Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:18.4883620Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T11:11:18.4892484Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:18.4918441Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:18.4932947Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.4955531Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:18.4966386Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.4992940Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:18.5008338Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:18.5031430Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T11:11:18.5041558Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:18.5063810Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.5075377Z Entering 'third_party/kleidiai' 2025-12-04T11:11:18.5096373Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T11:11:18.5108747Z Entering 'third_party/mimalloc' 2025-12-04T11:11:18.5131600Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T11:11:18.5144521Z Entering 'third_party/nlohmann' 2025-12-04T11:11:18.5167792Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T11:11:18.5180055Z Entering 'third_party/onnx' 2025-12-04T11:11:18.5204562Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T11:11:18.5220785Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:18.5246282Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:18.5260374Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:18.5281502Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T11:11:18.5296847Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:18.5322869Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:18.5333753Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:18.5358883Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.5368899Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:18.5397652Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T11:11:18.5408229Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:18.5433964Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T11:11:18.5444718Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:18.5481076Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T11:11:18.5494816Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:18.5517435Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T11:11:18.5528537Z Entering 
'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:18.5555915Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T11:11:18.5567658Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.5590297Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T11:11:18.5604198Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.5626197Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T11:11:18.5638052Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:18.5661019Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T11:11:18.5684455Z Entering 'third_party/pocketfft' 2025-12-04T11:11:18.5710997Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T11:11:18.5722966Z Entering 'third_party/protobuf' 2025-12-04T11:11:18.5749864Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T11:11:18.5762818Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:18.5797474Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T11:11:18.5809189Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:18.5830525Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.5844518Z Entering 'third_party/psimd' 2025-12-04T11:11:18.5867348Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T11:11:18.5879222Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:18.5901830Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T11:11:18.5913134Z Entering 'third_party/pybind11' 2025-12-04T11:11:18.5936774Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:18.5948551Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:18.5972288Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T11:11:18.5983095Z Entering 'third_party/sleef' 2025-12-04T11:11:18.6005715Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T11:11:18.6016646Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:18.6040914Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T11:11:18.6051887Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:18.6075752Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T11:11:18.6094063Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:18.6121262Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T11:11:18.6132550Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:18.6167801Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T11:11:18.6179168Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:18.6200213Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T11:11:18.6212623Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:18.6234320Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T11:11:18.6512381Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:' 2025-12-04T11:11:18.6692760Z Entering 'android/libs/fbjni' 2025-12-04T11:11:18.6719344Z Entering 'third_party/FP16' 2025-12-04T11:11:18.6748280Z Entering 'third_party/FXdiv' 2025-12-04T11:11:18.6770224Z Entering 'third_party/NNPACK' 2025-12-04T11:11:18.6796619Z Entering 'third_party/NVTX' 2025-12-04T11:11:18.6820880Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:18.6844876Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:18.6873889Z Entering 'third_party/aiter' 2025-12-04T11:11:18.6909912Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:18.6944341Z Entering 'third_party/benchmark' 2025-12-04T11:11:18.6971789Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:18.7000746Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:18.7031106Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:18.7061511Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:18.7085870Z Entering 'third_party/cutlass' 2025-12-04T11:11:18.7116013Z Entering 'third_party/fbgemm' 2025-12-04T11:11:18.7138999Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:18.7159007Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:18.7189467Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:18.7210634Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:18.7238073Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:18.7265333Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:18.7300481Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:18.7326644Z Entering 'third_party/flash-attention' 2025-12-04T11:11:18.7353501Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:18.7380445Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:18.7413156Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:18.7442889Z Entering 'third_party/fmt' 2025-12-04T11:11:18.7465134Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:18.7488357Z Entering 'third_party/gloo' 2025-12-04T11:11:18.7515103Z Entering 'third_party/googletest' 2025-12-04T11:11:18.7537353Z Entering 'third_party/ideep' 2025-12-04T11:11:18.7560963Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:18.7582942Z Entering 'third_party/ittapi' 2025-12-04T11:11:18.7608031Z Entering 'third_party/kineto' 2025-12-04T11:11:18.7631961Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:18.7653855Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:18.7677698Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:18.7705841Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:18.7735247Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:18.7758324Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T11:11:18.7786242Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:18.7807630Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:18.7839613Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:18.7868799Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:18.7890209Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:18.7914283Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.7937725Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.7970726Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:18.7997740Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:18.8033328Z Entering 'third_party/kleidiai' 2025-12-04T11:11:18.8059532Z Entering 'third_party/mimalloc' 2025-12-04T11:11:18.8082410Z Entering 'third_party/nlohmann' 2025-12-04T11:11:18.8106057Z Entering 'third_party/onnx' 2025-12-04T11:11:18.8135445Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:18.8159820Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:18.8185339Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:18.8208465Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:18.8237944Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:18.8269636Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:18.8296200Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:18.8329529Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:18.8352089Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:18.8378965Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:18.8401357Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:18.8434159Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:18.8467090Z Entering 'third_party/pocketfft' 2025-12-04T11:11:18.8493139Z Entering 'third_party/protobuf' 2025-12-04T11:11:18.8525729Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:18.8557504Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:18.8590230Z Entering 'third_party/psimd' 2025-12-04T11:11:18.8615061Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:18.8638463Z Entering 'third_party/pybind11' 2025-12-04T11:11:18.8669071Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:18.8693495Z Entering 'third_party/sleef' 2025-12-04T11:11:18.8722366Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:18.8745592Z 
Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:18.8771782Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:18.8795824Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:18.8828468Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:18.8853217Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:18.8893587Z [command]/usr/bin/git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' 'org-21003710@github.com:' 2025-12-04T11:11:18.9079833Z Entering 'android/libs/fbjni' 2025-12-04T11:11:18.9100196Z Entering 'third_party/FP16' 2025-12-04T11:11:18.9130082Z Entering 'third_party/FXdiv' 2025-12-04T11:11:18.9153125Z Entering 'third_party/NNPACK' 2025-12-04T11:11:18.9183262Z Entering 'third_party/NVTX' 2025-12-04T11:11:18.9207956Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T11:11:18.9228501Z Entering 'third_party/XNNPACK' 2025-12-04T11:11:18.9257388Z Entering 'third_party/aiter' 2025-12-04T11:11:18.9283263Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T11:11:18.9314470Z Entering 'third_party/benchmark' 2025-12-04T11:11:18.9339347Z Entering 'third_party/composable_kernel' 2025-12-04T11:11:18.9367004Z Entering 'third_party/cpp-httplib' 2025-12-04T11:11:18.9398439Z Entering 'third_party/cpuinfo' 2025-12-04T11:11:18.9424354Z Entering 'third_party/cudnn_frontend' 2025-12-04T11:11:18.9451229Z Entering 'third_party/cutlass' 2025-12-04T11:11:18.9479376Z Entering 'third_party/fbgemm' 2025-12-04T11:11:18.9504768Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T11:11:18.9531336Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T11:11:18.9564271Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T11:11:18.9591660Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T11:11:18.9618475Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T11:11:18.9641537Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T11:11:18.9662630Z Entering 'third_party/fbgemm/external/json' 2025-12-04T11:11:18.9693235Z Entering 'third_party/flash-attention' 2025-12-04T11:11:18.9715669Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T11:11:18.9740958Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T11:11:18.9773025Z Entering 'third_party/flatbuffers' 2025-12-04T11:11:18.9796578Z Entering 'third_party/fmt' 2025-12-04T11:11:18.9818345Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T11:11:18.9840553Z Entering 'third_party/gloo' 2025-12-04T11:11:18.9863698Z Entering 'third_party/googletest' 2025-12-04T11:11:18.9885563Z Entering 'third_party/ideep' 2025-12-04T11:11:18.9908566Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T11:11:18.9938227Z Entering 'third_party/ittapi' 2025-12-04T11:11:18.9969783Z Entering 'third_party/kineto' 2025-12-04T11:11:18.9993244Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T11:11:19.0018418Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T11:11:19.0044739Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T11:11:19.0072654Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T11:11:19.0098636Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T11:11:19.0123128Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 
2025-12-04T11:11:19.0146698Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T11:11:19.0167558Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T11:11:19.0188788Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T11:11:19.0208917Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T11:11:19.0228392Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T11:11:19.0254563Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:19.0280847Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:19.0307452Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T11:11:19.0326795Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T11:11:19.0356767Z Entering 'third_party/kleidiai' 2025-12-04T11:11:19.0378906Z Entering 'third_party/mimalloc' 2025-12-04T11:11:19.0409234Z Entering 'third_party/nlohmann' 2025-12-04T11:11:19.0438555Z Entering 'third_party/onnx' 2025-12-04T11:11:19.0475787Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T11:11:19.0499864Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T11:11:19.0520245Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T11:11:19.0540118Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T11:11:19.0567079Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T11:11:19.0594604Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T11:11:19.0617980Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T11:11:19.0646662Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T11:11:19.0671030Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T11:11:19.0698257Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T11:11:19.0727528Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T11:11:19.0758627Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T11:11:19.0799176Z Entering 'third_party/pocketfft' 2025-12-04T11:11:19.0824489Z Entering 'third_party/protobuf' 2025-12-04T11:11:19.0852695Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T11:11:19.0875438Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T11:11:19.0903160Z Entering 'third_party/psimd' 2025-12-04T11:11:19.0930948Z Entering 'third_party/pthreadpool' 2025-12-04T11:11:19.0962089Z Entering 'third_party/pybind11' 2025-12-04T11:11:19.0986811Z Entering 'third_party/python-peachpy' 2025-12-04T11:11:19.1006874Z Entering 'third_party/sleef' 2025-12-04T11:11:19.1029491Z Entering 'third_party/tensorpipe' 2025-12-04T11:11:19.1051913Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T11:11:19.1073092Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T11:11:19.1095104Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T11:11:19.1122488Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T11:11:19.1147989Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T11:11:19.1182175Z ##[endgroup] 2025-12-04T11:11:19.1465746Z [command]/usr/bin/git log -1 
--format=%H 2025-12-04T11:11:19.1684611Z ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:19.1856409Z Prepare all required actions 2025-12-04T11:11:19.1856725Z Getting action download info 2025-12-04T11:11:19.4634623Z Download action repository 'aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076' (SHA:062b18b96a7aff071d4dc91bc00c4c1a7945b076) 2025-12-04T11:11:20.2008121Z ##[group]Run ./.github/actions/setup-rocm 2025-12-04T11:11:20.2008320Z env: 2025-12-04T11:11:20.2008412Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2008516Z ##[endgroup] 2025-12-04T11:11:20.2019620Z ##[group]Run dpkg -l | grep -E " rocm" 2025-12-04T11:11:20.2019757Z dpkg -l | grep -E " rocm" 2025-12-04T11:11:20.2023206Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.2023349Z env: 2025-12-04T11:11:20.2023438Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2023544Z ##[endgroup] 2025-12-04T11:11:20.2087030Z ii rocm-cmake 0.14.0.60401-83~22.04 amd64 rocm-cmake built using CMake 2025-12-04T11:11:20.2087263Z ii rocm-core 6.4.1.60401-83~22.04 amd64 ROCm Runtime software stack 2025-12-04T11:11:20.2087509Z ii rocm-dbgapi 0.77.2.60401-83~22.04 amd64 Library to provide AMD GPU debugger API 2025-12-04T11:11:20.2087761Z ii rocm-debug-agent 2.0.4.60401-83~22.04 amd64 Radeon Open Compute Debug Agent (ROCdebug-agent) 2025-12-04T11:11:20.2088009Z ii rocm-dev 6.4.1.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-12-04T11:11:20.2088586Z ii rocm-device-libs 1.0.0.60401-83~22.04 amd64 Radeon Open Compute - device libraries 2025-12-04T11:11:20.2088793Z ii rocm-gdb 15.2.60401-83~22.04 amd64 ROCgdb 2025-12-04T11:11:20.2088989Z ii rocm-llvm 19.0.0.25184.60401-83~22.04 amd64 ROCm core compiler 2025-12-04T11:11:20.2089197Z ii rocm-opencl 2.0.0.60401-83~22.04 amd64 clr built using CMake 2025-12-04T11:11:20.2089539Z ii rocm-opencl-dev 2.0.0.60401-83~22.04 amd64 clr built using CMake 2025-12-04T11:11:20.2089887Z ii rocm-smi-lib 7.5.0.60401-83~22.04 amd64 AMD System Management libraries 2025-12-04T11:11:20.2090183Z ii rocm-utils 6.4.1.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime software stack 2025-12-04T11:11:20.2090427Z ii rocminfo 1.0.0.60401-83~22.04 amd64 Radeon Open Compute (ROCm) Runtime rocminfo tool 2025-12-04T11:11:20.2109411Z ##[group]Run # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T11:11:20.2109754Z # ignore expansion of "docker ps -q" since it could be empty 2025-12-04T11:11:20.2109953Z # shellcheck disable=SC2046 2025-12-04T11:11:20.2110135Z docker stop $(docker ps -q) || true 2025-12-04T11:11:20.2110300Z # Prune all stopped containers. 2025-12-04T11:11:20.2110620Z docker container prune -f 2025-12-04T11:11:20.2115736Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.2115922Z env: 2025-12-04T11:11:20.2116036Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2116178Z ##[endgroup] 2025-12-04T11:11:20.2375698Z docker: 'docker stop' requires at least 1 argument 2025-12-04T11:11:20.2376054Z 2025-12-04T11:11:20.2376246Z Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...] 
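For reference, the two "git submodule foreach --recursive git config --local --add 'url.https://github.com/.insteadOf' ..." invocations in the checkout step above install URL remappings so that any submodule remote recorded with an SSH-style prefix is fetched over HTTPS instead. A minimal sketch of the same remapping, assuming a local clone at ./pytorch (the path is hypothetical and not part of this job):

  # Rewrite SSH-style GitHub URLs to HTTPS for every submodule, recursively,
  # mirroring the git config url.<base>.insteadOf calls echoed in the log above.
  cd ./pytorch
  git submodule foreach --recursive \
    git config --local --add 'url.https://github.com/.insteadOf' 'git@github.com:'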
2025-12-04T11:11:20.2376523Z 2025-12-04T11:11:20.2376700Z See 'docker stop --help' for more information 2025-12-04T11:11:20.2478368Z Total reclaimed space: 0B 2025-12-04T11:11:20.2507005Z ##[group]Run cat /etc/os-release || true 2025-12-04T11:11:20.2507230Z cat /etc/os-release || true 2025-12-04T11:11:20.2507430Z cat /etc/apt/sources.list.d/rocm.list || true 2025-12-04T11:11:20.2507841Z cat /opt/rocm/.info/version || true 2025-12-04T11:11:20.2508019Z whoami 2025-12-04T11:11:20.2513465Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.2513633Z env: 2025-12-04T11:11:20.2513729Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2513850Z ##[endgroup] 2025-12-04T11:11:20.2533855Z PRETTY_NAME="Ubuntu 22.04.5 LTS" 2025-12-04T11:11:20.2534172Z NAME="Ubuntu" 2025-12-04T11:11:20.2534370Z VERSION_ID="22.04" 2025-12-04T11:11:20.2534606Z VERSION="22.04.5 LTS (Jammy Jellyfish)" 2025-12-04T11:11:20.2534878Z VERSION_CODENAME=jammy 2025-12-04T11:11:20.2535084Z ID=ubuntu 2025-12-04T11:11:20.2535266Z ID_LIKE=debian 2025-12-04T11:11:20.2535503Z HOME_URL="https://www.ubuntu.com/" 2025-12-04T11:11:20.2535798Z SUPPORT_URL="https://help.ubuntu.com/" 2025-12-04T11:11:20.2536134Z BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" 2025-12-04T11:11:20.2536603Z PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" 2025-12-04T11:11:20.2537029Z UBUNTU_CODENAME=jammy 2025-12-04T11:11:20.2542256Z deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/6.4.1 jammy main 2025-12-04T11:11:20.2549659Z 6.4.1-83 2025-12-04T11:11:20.2555846Z runner 2025-12-04T11:11:20.2576487Z ##[group]Run dpkg -l | grep -E " amdgpu" 2025-12-04T11:11:20.2576688Z dpkg -l | grep -E " amdgpu" 2025-12-04T11:11:20.2581389Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.2581539Z env: 2025-12-04T11:11:20.2581629Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2581734Z ##[endgroup] 2025-12-04T11:11:20.2629967Z ii amdgpu-core 1:6.4.60401-2164967.22.04 all Core meta package for unified amdgpu driver. 
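The container-cleanup step above runs "docker stop $(docker ps -q) || true"; with no containers running the substitution is empty, so docker stop prints the "requires at least 1 argument" usage error, the "|| true" swallows it, and "docker container prune -f" then reports 0B reclaimed. A sketch of an equivalent guard that simply skips the stop when nothing is running (illustrative only, not the workflow's script):

  # Stop running containers only if any exist, then prune stopped ones.
  running=$(docker ps -q)
  if [ -n "$running" ]; then
    # shellcheck disable=SC2086  # word-splitting of the ID list is intended
    docker stop $running
  fi
  docker container prune -f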
2025-12-04T11:11:20.2630220Z ii amdgpu-install 6.4.60401-2164967.22.04 all AMDGPU driver repository and installer 2025-12-04T11:11:20.2651408Z ##[group]Run rocm-smi 2025-12-04T11:11:20.2651586Z rocm-smi 2025-12-04T11:11:20.2656508Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.2656712Z env: 2025-12-04T11:11:20.2656817Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.2656925Z ##[endgroup] 2025-12-04T11:11:20.3387354Z 2025-12-04T11:11:20.3387461Z 2025-12-04T11:11:20.3387727Z ============================================ ROCm System Management Interface ============================================ 2025-12-04T11:11:20.3388007Z ====================================================== Concise Info ====================================================== 2025-12-04T11:11:20.3388316Z Device Node IDs Temp Power Partitions SCLK MCLK Fan Perf PwrCap VRAM% GPU% 2025-12-04T11:11:20.3388969Z  (DID, GUID) (Junction) (Socket) (Mem, Compute, ID)  2025-12-04T11:11:20.3389205Z ========================================================================================================================== 2025-12-04T11:11:20.3389747Z 0 7 0x74a5, 26567 27.0°C 114.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T11:11:20.3390259Z 1 9 0x74a5, 43978 28.0°C 118.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T11:11:20.3390559Z 2 8 0x74a5, 20463 28.0°C 116.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T11:11:20.3390858Z 3 6 0x74a5, 33762 27.0°C 117.0W NPS1, SPX, 0 N/A 900Mhz 0% manual 1000.0W 0% 0% 2025-12-04T11:11:20.3391068Z ========================================================================================================================== 2025-12-04T11:11:20.3391258Z ================================================== End of ROCm SMI Log =================================================== 2025-12-04T11:11:20.3455049Z ##[group]Run rocminfo 2025-12-04T11:11:20.3455226Z rocminfo 2025-12-04T11:11:20.3460905Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.3461069Z env: 2025-12-04T11:11:20.3461196Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.3461308Z ##[endgroup] 2025-12-04T11:11:20.4449668Z ROCk module version 6.12.12 is loaded 2025-12-04T11:11:20.4449871Z ===================== 2025-12-04T11:11:20.4450084Z HSA System Attributes 2025-12-04T11:11:20.4450225Z ===================== 2025-12-04T11:11:20.4450365Z Runtime Version: 1.15 2025-12-04T11:11:20.4450541Z Runtime Ext Version: 1.7 2025-12-04T11:11:20.4450695Z System Timestamp Freq.: 1000.000000MHz 2025-12-04T11:11:20.4450955Z Sig. 
Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-12-04T11:11:20.4451311Z Machine Model: LARGE 2025-12-04T11:11:20.4451537Z System Endianness: LITTLE 2025-12-04T11:11:20.4451760Z Mwaitx: DISABLED 2025-12-04T11:11:20.4451920Z XNACK enabled: NO 2025-12-04T11:11:20.4452069Z DMAbuf Support: YES 2025-12-04T11:11:20.4452217Z VMM Support: YES 2025-12-04T11:11:20.4452312Z 2025-12-04T11:11:20.4452371Z ========== 2025-12-04T11:11:20.4475801Z HSA Agents 2025-12-04T11:11:20.4475974Z ========== 2025-12-04T11:11:20.4476076Z ******* 2025-12-04T11:11:20.4476183Z Agent 1 2025-12-04T11:11:20.4476281Z ******* 2025-12-04T11:11:20.4476415Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:11:20.4476619Z Uuid: CPU-XX 2025-12-04T11:11:20.4476779Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:11:20.4476995Z Vendor Name: CPU 2025-12-04T11:11:20.4477159Z Feature: None specified 2025-12-04T11:11:20.4477326Z Profile: FULL_PROFILE 2025-12-04T11:11:20.4477508Z Float Round Mode: NEAR 2025-12-04T11:11:20.4477681Z Max Queue Number: 0(0x0) 2025-12-04T11:11:20.4477846Z Queue Min Size: 0(0x0) 2025-12-04T11:11:20.4478006Z Queue Max Size: 0(0x0) 2025-12-04T11:11:20.4478221Z Queue Type: MULTI 2025-12-04T11:11:20.4478375Z Node: 0 2025-12-04T11:11:20.4478538Z Device Type: CPU 2025-12-04T11:11:20.4478694Z Cache Info: 2025-12-04T11:11:20.4478871Z L1: 49152(0xc000) KB 2025-12-04T11:11:20.4479014Z Chip ID: 0(0x0) 2025-12-04T11:11:20.4479168Z ASIC Revision: 0(0x0) 2025-12-04T11:11:20.4479349Z Cacheline Size: 64(0x40) 2025-12-04T11:11:20.4479505Z Max Clock Freq. (MHz): 3300 2025-12-04T11:11:20.4479833Z BDFID: 0 2025-12-04T11:11:20.4480011Z Internal Node ID: 0 2025-12-04T11:11:20.4480258Z Compute Unit: 128 2025-12-04T11:11:20.4480427Z SIMDs per CU: 0 2025-12-04T11:11:20.4480589Z Shader Engines: 0 2025-12-04T11:11:20.4480746Z Shader Arrs. per Eng.: 0 2025-12-04T11:11:20.4480916Z WatchPts on Addr. 
Ranges:1 2025-12-04T11:11:20.4481076Z Memory Properties: 2025-12-04T11:11:20.4481197Z Features: None 2025-12-04T11:11:20.4481316Z Pool Info: 2025-12-04T11:11:20.4481514Z Pool 1 2025-12-04T11:11:20.4481660Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4481832Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:11:20.4481995Z Allocatable: TRUE 2025-12-04T11:11:20.4482161Z Alloc Granule: 4KB 2025-12-04T11:11:20.4482327Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4482499Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4482672Z Accessible by all: TRUE 2025-12-04T11:11:20.4482813Z Pool 2 2025-12-04T11:11:20.4482978Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4483159Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:11:20.4483351Z Allocatable: TRUE 2025-12-04T11:11:20.4483516Z Alloc Granule: 4KB 2025-12-04T11:11:20.4483681Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4483854Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4484021Z Accessible by all: TRUE 2025-12-04T11:11:20.4484158Z Pool 3 2025-12-04T11:11:20.4484309Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T11:11:20.4484499Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:11:20.4484648Z Allocatable: TRUE 2025-12-04T11:11:20.4484811Z Alloc Granule: 4KB 2025-12-04T11:11:20.4484990Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4485159Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4485328Z Accessible by all: TRUE 2025-12-04T11:11:20.4485510Z Pool 4 2025-12-04T11:11:20.4485651Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4485813Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:11:20.4485982Z Allocatable: TRUE 2025-12-04T11:11:20.4486171Z Alloc Granule: 4KB 2025-12-04T11:11:20.4486341Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4486504Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4486681Z Accessible by all: TRUE 2025-12-04T11:11:20.4486819Z ISA Info: 2025-12-04T11:11:20.4486931Z ******* 2025-12-04T11:11:20.4487065Z Agent 2 2025-12-04T11:11:20.4487197Z ******* 2025-12-04T11:11:20.4487322Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:11:20.4487518Z Uuid: CPU-XX 2025-12-04T11:11:20.4487691Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:11:20.4487861Z Vendor Name: CPU 2025-12-04T11:11:20.4488014Z Feature: None specified 2025-12-04T11:11:20.4488236Z Profile: FULL_PROFILE 2025-12-04T11:11:20.4488400Z Float Round Mode: NEAR 2025-12-04T11:11:20.4488557Z Max Queue Number: 0(0x0) 2025-12-04T11:11:20.4488724Z Queue Min Size: 0(0x0) 2025-12-04T11:11:20.4488881Z Queue Max Size: 0(0x0) 2025-12-04T11:11:20.4489077Z Queue Type: MULTI 2025-12-04T11:11:20.4489244Z Node: 1 2025-12-04T11:11:20.4489402Z Device Type: CPU 2025-12-04T11:11:20.4489552Z Cache Info: 2025-12-04T11:11:20.4489719Z L1: 49152(0xc000) KB 2025-12-04T11:11:20.4489889Z Chip ID: 0(0x0) 2025-12-04T11:11:20.4490043Z ASIC Revision: 0(0x0) 2025-12-04T11:11:20.4490242Z Cacheline Size: 64(0x40) 2025-12-04T11:11:20.4490417Z Max Clock Freq. (MHz): 3300 2025-12-04T11:11:20.4490572Z BDFID: 0 2025-12-04T11:11:20.4490721Z Internal Node ID: 1 2025-12-04T11:11:20.4490881Z Compute Unit: 128 2025-12-04T11:11:20.4491039Z SIMDs per CU: 0 2025-12-04T11:11:20.4491191Z Shader Engines: 0 2025-12-04T11:11:20.4491360Z Shader Arrs. per Eng.: 0 2025-12-04T11:11:20.4491529Z WatchPts on Addr. 
Ranges:1 2025-12-04T11:11:20.4491673Z Memory Properties: 2025-12-04T11:11:20.4491789Z Features: None 2025-12-04T11:11:20.4491899Z Pool Info: 2025-12-04T11:11:20.4492010Z Pool 1 2025-12-04T11:11:20.4492149Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4492300Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:11:20.4492457Z Allocatable: TRUE 2025-12-04T11:11:20.4492621Z Alloc Granule: 4KB 2025-12-04T11:11:20.4492785Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4492958Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4493124Z Accessible by all: TRUE 2025-12-04T11:11:20.4493261Z Pool 2 2025-12-04T11:11:20.4493399Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4493551Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:11:20.4493705Z Allocatable: TRUE 2025-12-04T11:11:20.4493869Z Alloc Granule: 4KB 2025-12-04T11:11:20.4494031Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4494201Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4494367Z Accessible by all: TRUE 2025-12-04T11:11:20.4494507Z Pool 3 2025-12-04T11:11:20.4494643Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T11:11:20.4494829Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:11:20.4494981Z Allocatable: TRUE 2025-12-04T11:11:20.4495133Z Alloc Granule: 4KB 2025-12-04T11:11:20.4495285Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4495441Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4495593Z Accessible by all: TRUE 2025-12-04T11:11:20.4495722Z Pool 4 2025-12-04T11:11:20.4495846Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4495986Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:11:20.4496173Z Allocatable: TRUE 2025-12-04T11:11:20.4496327Z Alloc Granule: 4KB 2025-12-04T11:11:20.4496483Z Alloc Recommended Granule:4KB 2025-12-04T11:11:20.4496639Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4496794Z Accessible by all: TRUE 2025-12-04T11:11:20.4496924Z ISA Info: 2025-12-04T11:11:20.4497019Z ******* 2025-12-04T11:11:20.4497111Z Agent 3 2025-12-04T11:11:20.4497204Z ******* 2025-12-04T11:11:20.4497309Z Name: gfx942 2025-12-04T11:11:20.4497443Z Uuid: GPU-e92b40ee81585045 2025-12-04T11:11:20.4497590Z Marketing Name: AMD Instinct MI325X 2025-12-04T11:11:20.4497739Z Vendor Name: AMD 2025-12-04T11:11:20.4497883Z Feature: KERNEL_DISPATCH 2025-12-04T11:11:20.4498027Z Profile: BASE_PROFILE 2025-12-04T11:11:20.4498213Z Float Round Mode: NEAR 2025-12-04T11:11:20.4498364Z Max Queue Number: 128(0x80) 2025-12-04T11:11:20.4498509Z Queue Min Size: 64(0x40) 2025-12-04T11:11:20.4498650Z Queue Max Size: 131072(0x20000) 2025-12-04T11:11:20.4498797Z Queue Type: MULTI 2025-12-04T11:11:20.4498934Z Node: 2 2025-12-04T11:11:20.4499068Z Device Type: GPU 2025-12-04T11:11:20.4499195Z Cache Info: 2025-12-04T11:11:20.4499302Z L1: 32(0x20) KB 2025-12-04T11:11:20.4499428Z L2: 4096(0x1000) KB 2025-12-04T11:11:20.4499560Z L3: 262144(0x40000) KB 2025-12-04T11:11:20.4499689Z Chip ID: 29861(0x74a5) 2025-12-04T11:11:20.4499831Z ASIC Revision: 1(0x1) 2025-12-04T11:11:20.4499983Z Cacheline Size: 128(0x80) 2025-12-04T11:11:20.4500127Z Max Clock Freq. (MHz): 2100 2025-12-04T11:11:20.4500269Z BDFID: 62720 2025-12-04T11:11:20.4500414Z Internal Node ID: 2 2025-12-04T11:11:20.4500558Z Compute Unit: 304 2025-12-04T11:11:20.4500705Z SIMDs per CU: 4 2025-12-04T11:11:20.4500853Z Shader Engines: 32 2025-12-04T11:11:20.4501008Z Shader Arrs. per Eng.: 1 2025-12-04T11:11:20.4501171Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:11:20.4501378Z Coherent Host Access: FALSE 2025-12-04T11:11:20.4501519Z Memory Properties: 2025-12-04T11:11:20.4501634Z Features: KERNEL_DISPATCH 2025-12-04T11:11:20.4501768Z Fast F16 Operation: TRUE 2025-12-04T11:11:20.4501919Z Wavefront Size: 64(0x40) 2025-12-04T11:11:20.4502065Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4502202Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4502319Z x 1024(0x400) 2025-12-04T11:11:20.4502439Z y 1024(0x400) 2025-12-04T11:11:20.4502559Z z 1024(0x400) 2025-12-04T11:11:20.4502694Z Max Waves Per CU: 32(0x20) 2025-12-04T11:11:20.4502875Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:11:20.4503037Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4503169Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4503277Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4503401Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4503525Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4503664Z Max fbarriers/Workgrp: 32 2025-12-04T11:11:20.4509248Z Packet Processor uCode:: 185 2025-12-04T11:11:20.4509413Z SDMA engine uCode:: 24 2025-12-04T11:11:20.4509566Z IOMMU Support:: None 2025-12-04T11:11:20.4509699Z Pool Info: 2025-12-04T11:11:20.4509799Z Pool 1 2025-12-04T11:11:20.4509932Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4510081Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4510230Z Allocatable: TRUE 2025-12-04T11:11:20.4510384Z Alloc Granule: 4KB 2025-12-04T11:11:20.4510540Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4510699Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4510856Z Accessible by all: FALSE 2025-12-04T11:11:20.4510986Z Pool 2 2025-12-04T11:11:20.4511113Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4511258Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4511398Z Allocatable: TRUE 2025-12-04T11:11:20.4511550Z Alloc Granule: 4KB 2025-12-04T11:11:20.4511707Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4511863Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4512017Z Accessible by all: FALSE 2025-12-04T11:11:20.4512147Z Pool 3 2025-12-04T11:11:20.4512268Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4512411Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4512551Z Allocatable: TRUE 2025-12-04T11:11:20.4512703Z Alloc Granule: 4KB 2025-12-04T11:11:20.4512860Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4513016Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4513171Z Accessible by all: FALSE 2025-12-04T11:11:20.4513300Z Pool 4 2025-12-04T11:11:20.4513505Z Segment: GROUP 2025-12-04T11:11:20.4513638Z Size: 64(0x40) KB 2025-12-04T11:11:20.4513775Z Allocatable: FALSE 2025-12-04T11:11:20.4513922Z Alloc Granule: 0KB 2025-12-04T11:11:20.4514078Z Alloc Recommended Granule:0KB 2025-12-04T11:11:20.4514233Z Alloc Alignment: 0KB 2025-12-04T11:11:20.4514387Z Accessible by all: FALSE 2025-12-04T11:11:20.4514518Z ISA Info: 2025-12-04T11:11:20.4514618Z ISA 1 2025-12-04T11:11:20.4514787Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:11:20.4514948Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4515109Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4515265Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4515420Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4515571Z Fast f16: TRUE 2025-12-04T11:11:20.4515719Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4515857Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4515985Z x 1024(0x400) 2025-12-04T11:11:20.4516114Z y 1024(0x400) 2025-12-04T11:11:20.4516241Z z 1024(0x400) 
2025-12-04T11:11:20.4516386Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4516517Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4516635Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4516767Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4516888Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4517034Z FBarrier Max Size: 32 2025-12-04T11:11:20.4517165Z ISA 2 2025-12-04T11:11:20.4517304Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:11:20.4517480Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4517634Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4517785Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4517941Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4518087Z Fast f16: TRUE 2025-12-04T11:11:20.4518270Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4518409Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4518533Z x 1024(0x400) 2025-12-04T11:11:20.4518659Z y 1024(0x400) 2025-12-04T11:11:20.4518780Z z 1024(0x400) 2025-12-04T11:11:20.4518918Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4519056Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4519172Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4519303Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4519431Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4519574Z FBarrier Max Size: 32 2025-12-04T11:11:20.4519709Z ******* 2025-12-04T11:11:20.4519848Z Agent 4 2025-12-04T11:11:20.4519947Z ******* 2025-12-04T11:11:20.4520061Z Name: gfx942 2025-12-04T11:11:20.4520202Z Uuid: GPU-0f23c118dd1bca7f 2025-12-04T11:11:20.4520358Z Marketing Name: AMD Instinct MI325X 2025-12-04T11:11:20.4520517Z Vendor Name: AMD 2025-12-04T11:11:20.4520664Z Feature: KERNEL_DISPATCH 2025-12-04T11:11:20.4520815Z Profile: BASE_PROFILE 2025-12-04T11:11:20.4520963Z Float Round Mode: NEAR 2025-12-04T11:11:20.4521118Z Max Queue Number: 128(0x80) 2025-12-04T11:11:20.4521310Z Queue Min Size: 64(0x40) 2025-12-04T11:11:20.4521457Z Queue Max Size: 131072(0x20000) 2025-12-04T11:11:20.4521610Z Queue Type: MULTI 2025-12-04T11:11:20.4521757Z Node: 3 2025-12-04T11:11:20.4521897Z Device Type: GPU 2025-12-04T11:11:20.4522032Z Cache Info: 2025-12-04T11:11:20.4522145Z L1: 32(0x20) KB 2025-12-04T11:11:20.4522283Z L2: 4096(0x1000) KB 2025-12-04T11:11:20.4522416Z L3: 262144(0x40000) KB 2025-12-04T11:11:20.4522548Z Chip ID: 29861(0x74a5) 2025-12-04T11:11:20.4522695Z ASIC Revision: 1(0x1) 2025-12-04T11:11:20.4522851Z Cacheline Size: 128(0x80) 2025-12-04T11:11:20.4523002Z Max Clock Freq. (MHz): 2100 2025-12-04T11:11:20.4523154Z BDFID: 34048 2025-12-04T11:11:20.4523295Z Internal Node ID: 3 2025-12-04T11:11:20.4523447Z Compute Unit: 304 2025-12-04T11:11:20.4523595Z SIMDs per CU: 4 2025-12-04T11:11:20.4523742Z Shader Engines: 32 2025-12-04T11:11:20.4523898Z Shader Arrs. per Eng.: 1 2025-12-04T11:11:20.4524055Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:11:20.4524212Z Coherent Host Access: FALSE 2025-12-04T11:11:20.4524353Z Memory Properties: 2025-12-04T11:11:20.4524464Z Features: KERNEL_DISPATCH 2025-12-04T11:11:20.4524610Z Fast F16 Operation: TRUE 2025-12-04T11:11:20.4524767Z Wavefront Size: 64(0x40) 2025-12-04T11:11:20.4524919Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4525061Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4525184Z x 1024(0x400) 2025-12-04T11:11:20.4525308Z y 1024(0x400) 2025-12-04T11:11:20.4525433Z z 1024(0x400) 2025-12-04T11:11:20.4525571Z Max Waves Per CU: 32(0x20) 2025-12-04T11:11:20.4525722Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:11:20.4525876Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4526008Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4526124Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4526256Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4526382Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4527137Z Max fbarriers/Workgrp: 32 2025-12-04T11:11:20.4527303Z Packet Processor uCode:: 185 2025-12-04T11:11:20.4527460Z SDMA engine uCode:: 24 2025-12-04T11:11:20.4527616Z IOMMU Support:: None 2025-12-04T11:11:20.4527750Z Pool Info: 2025-12-04T11:11:20.4527859Z Pool 1 2025-12-04T11:11:20.4527989Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4528137Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4528328Z Allocatable: TRUE 2025-12-04T11:11:20.4528484Z Alloc Granule: 4KB 2025-12-04T11:11:20.4528686Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4528854Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4529016Z Accessible by all: FALSE 2025-12-04T11:11:20.4529152Z Pool 2 2025-12-04T11:11:20.4529281Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4529426Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4529573Z Allocatable: TRUE 2025-12-04T11:11:20.4529727Z Alloc Granule: 4KB 2025-12-04T11:11:20.4529884Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4530047Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4530204Z Accessible by all: FALSE 2025-12-04T11:11:20.4530340Z Pool 3 2025-12-04T11:11:20.4530473Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4530618Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4530764Z Allocatable: TRUE 2025-12-04T11:11:20.4530919Z Alloc Granule: 4KB 2025-12-04T11:11:20.4531076Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4531239Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4531395Z Accessible by all: FALSE 2025-12-04T11:11:20.4531526Z Pool 4 2025-12-04T11:11:20.4531646Z Segment: GROUP 2025-12-04T11:11:20.4531783Z Size: 64(0x40) KB 2025-12-04T11:11:20.4531926Z Allocatable: FALSE 2025-12-04T11:11:20.4532074Z Alloc Granule: 0KB 2025-12-04T11:11:20.4532235Z Alloc Recommended Granule:0KB 2025-12-04T11:11:20.4532390Z Alloc Alignment: 0KB 2025-12-04T11:11:20.4532546Z Accessible by all: FALSE 2025-12-04T11:11:20.4532683Z ISA Info: 2025-12-04T11:11:20.4532779Z ISA 1 2025-12-04T11:11:20.4532908Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:11:20.4533068Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4533232Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4533396Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4533559Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4533710Z Fast f16: TRUE 2025-12-04T11:11:20.4533898Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4534043Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4534173Z x 1024(0x400) 2025-12-04T11:11:20.4534301Z y 1024(0x400) 2025-12-04T11:11:20.4534433Z z 1024(0x400) 
2025-12-04T11:11:20.4534574Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4534707Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4534826Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4534954Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4535109Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4535254Z FBarrier Max Size: 32 2025-12-04T11:11:20.4535389Z ISA 2 2025-12-04T11:11:20.4535531Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:11:20.4535704Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4535859Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4536019Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4536182Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4536330Z Fast f16: TRUE 2025-12-04T11:11:20.4536480Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4536618Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4536744Z x 1024(0x400) 2025-12-04T11:11:20.4536874Z y 1024(0x400) 2025-12-04T11:11:20.4536999Z z 1024(0x400) 2025-12-04T11:11:20.4537141Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4537279Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4537395Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4537527Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4537652Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4537796Z FBarrier Max Size: 32 2025-12-04T11:11:20.4537928Z ******* 2025-12-04T11:11:20.4538023Z Agent 5 2025-12-04T11:11:20.4538122Z ******* 2025-12-04T11:11:20.4538263Z Name: gfx942 2025-12-04T11:11:20.4538407Z Uuid: GPU-1385052698a87313 2025-12-04T11:11:20.4538561Z Marketing Name: AMD Instinct MI325X 2025-12-04T11:11:20.4538717Z Vendor Name: AMD 2025-12-04T11:11:20.4538867Z Feature: KERNEL_DISPATCH 2025-12-04T11:11:20.4539017Z Profile: BASE_PROFILE 2025-12-04T11:11:20.4539166Z Float Round Mode: NEAR 2025-12-04T11:11:20.4539321Z Max Queue Number: 128(0x80) 2025-12-04T11:11:20.4539472Z Queue Min Size: 64(0x40) 2025-12-04T11:11:20.4539619Z Queue Max Size: 131072(0x20000) 2025-12-04T11:11:20.4539769Z Queue Type: MULTI 2025-12-04T11:11:20.4539907Z Node: 4 2025-12-04T11:11:20.4540053Z Device Type: GPU 2025-12-04T11:11:20.4540187Z Cache Info: 2025-12-04T11:11:20.4540348Z L1: 32(0x20) KB 2025-12-04T11:11:20.4540481Z L2: 4096(0x1000) KB 2025-12-04T11:11:20.4540614Z L3: 262144(0x40000) KB 2025-12-04T11:11:20.4540746Z Chip ID: 29861(0x74a5) 2025-12-04T11:11:20.4540894Z ASIC Revision: 1(0x1) 2025-12-04T11:11:20.4541047Z Cacheline Size: 128(0x80) 2025-12-04T11:11:20.4541195Z Max Clock Freq. (MHz): 2100 2025-12-04T11:11:20.4541340Z BDFID: 58624 2025-12-04T11:11:20.4541484Z Internal Node ID: 4 2025-12-04T11:11:20.4541679Z Compute Unit: 304 2025-12-04T11:11:20.4541828Z SIMDs per CU: 4 2025-12-04T11:11:20.4541983Z Shader Engines: 32 2025-12-04T11:11:20.4542142Z Shader Arrs. per Eng.: 1 2025-12-04T11:11:20.4542301Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:11:20.4542460Z Coherent Host Access: FALSE 2025-12-04T11:11:20.4542602Z Memory Properties: 2025-12-04T11:11:20.4542718Z Features: KERNEL_DISPATCH 2025-12-04T11:11:20.4542867Z Fast F16 Operation: TRUE 2025-12-04T11:11:20.4543026Z Wavefront Size: 64(0x40) 2025-12-04T11:11:20.4543180Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4543327Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4543459Z x 1024(0x400) 2025-12-04T11:11:20.4543589Z y 1024(0x400) 2025-12-04T11:11:20.4543722Z z 1024(0x400) 2025-12-04T11:11:20.4543859Z Max Waves Per CU: 32(0x20) 2025-12-04T11:11:20.4544023Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:11:20.4544179Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4544315Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4544435Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4544567Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4544694Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4544846Z Max fbarriers/Workgrp: 32 2025-12-04T11:11:20.4545012Z Packet Processor uCode:: 185 2025-12-04T11:11:20.4545181Z SDMA engine uCode:: 24 2025-12-04T11:11:20.4545342Z IOMMU Support:: None 2025-12-04T11:11:20.4545479Z Pool Info: 2025-12-04T11:11:20.4545590Z Pool 1 2025-12-04T11:11:20.4545725Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4545875Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4546030Z Allocatable: TRUE 2025-12-04T11:11:20.4546194Z Alloc Granule: 4KB 2025-12-04T11:11:20.4546360Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4546527Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4546686Z Accessible by all: FALSE 2025-12-04T11:11:20.4546827Z Pool 2 2025-12-04T11:11:20.4546967Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4547115Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4547298Z Allocatable: TRUE 2025-12-04T11:11:20.4547459Z Alloc Granule: 4KB 2025-12-04T11:11:20.4547621Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4547789Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4547946Z Accessible by all: FALSE 2025-12-04T11:11:20.4548088Z Pool 3 2025-12-04T11:11:20.4548271Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4548416Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4548568Z Allocatable: TRUE 2025-12-04T11:11:20.4548775Z Alloc Granule: 4KB 2025-12-04T11:11:20.4548937Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4549111Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4549277Z Accessible by all: FALSE 2025-12-04T11:11:20.4549415Z Pool 4 2025-12-04T11:11:20.4549546Z Segment: GROUP 2025-12-04T11:11:20.4549686Z Size: 64(0x40) KB 2025-12-04T11:11:20.4549838Z Allocatable: FALSE 2025-12-04T11:11:20.4549999Z Alloc Granule: 0KB 2025-12-04T11:11:20.4550158Z Alloc Recommended Granule:0KB 2025-12-04T11:11:20.4550324Z Alloc Alignment: 0KB 2025-12-04T11:11:20.4550490Z Accessible by all: FALSE 2025-12-04T11:11:20.4550625Z ISA Info: 2025-12-04T11:11:20.4550734Z ISA 1 2025-12-04T11:11:20.4550860Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:11:20.4551025Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4551186Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4551342Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4551507Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4551659Z Fast f16: TRUE 2025-12-04T11:11:20.4551807Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4551951Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4552078Z x 1024(0x400) 2025-12-04T11:11:20.4552211Z y 1024(0x400) 2025-12-04T11:11:20.4552342Z z 1024(0x400) 
2025-12-04T11:11:20.4552480Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4552622Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4552743Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4552872Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4553002Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4553147Z FBarrier Max Size: 32 2025-12-04T11:11:20.4553278Z ISA 2 2025-12-04T11:11:20.4553415Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:11:20.4553584Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4553741Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4553942Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4554101Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4554257Z Fast f16: TRUE 2025-12-04T11:11:20.4554408Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4554698Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4554961Z x 1024(0x400) 2025-12-04T11:11:20.4555217Z y 1024(0x400) 2025-12-04T11:11:20.4555350Z z 1024(0x400) 2025-12-04T11:11:20.4555487Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4555630Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4555792Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4555921Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4556057Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4556204Z FBarrier Max Size: 32 2025-12-04T11:11:20.4556339Z ******* 2025-12-04T11:11:20.4556439Z Agent 6 2025-12-04T11:11:20.4556534Z ******* 2025-12-04T11:11:20.4556649Z Name: gfx942 2025-12-04T11:11:20.4556799Z Uuid: GPU-7b47bcc6019ee30a 2025-12-04T11:11:20.4556953Z Marketing Name: AMD Instinct MI325X 2025-12-04T11:11:20.4557116Z Vendor Name: AMD 2025-12-04T11:11:20.4557268Z Feature: KERNEL_DISPATCH 2025-12-04T11:11:20.4557426Z Profile: BASE_PROFILE 2025-12-04T11:11:20.4557581Z Float Round Mode: NEAR 2025-12-04T11:11:20.4557736Z Max Queue Number: 128(0x80) 2025-12-04T11:11:20.4557889Z Queue Min Size: 64(0x40) 2025-12-04T11:11:20.4558040Z Queue Max Size: 131072(0x20000) 2025-12-04T11:11:20.4558221Z Queue Type: MULTI 2025-12-04T11:11:20.4558367Z Node: 5 2025-12-04T11:11:20.4558506Z Device Type: GPU 2025-12-04T11:11:20.4558639Z Cache Info: 2025-12-04T11:11:20.4558754Z L1: 32(0x20) KB 2025-12-04T11:11:20.4558882Z L2: 4096(0x1000) KB 2025-12-04T11:11:20.4559015Z L3: 262144(0x40000) KB 2025-12-04T11:11:20.4559151Z Chip ID: 29861(0x74a5) 2025-12-04T11:11:20.4559296Z ASIC Revision: 1(0x1) 2025-12-04T11:11:20.4559449Z Cacheline Size: 128(0x80) 2025-12-04T11:11:20.4559598Z Max Clock Freq. (MHz): 2100 2025-12-04T11:11:20.4559745Z BDFID: 38144 2025-12-04T11:11:20.4559892Z Internal Node ID: 5 2025-12-04T11:11:20.4560040Z Compute Unit: 304 2025-12-04T11:11:20.4560191Z SIMDs per CU: 4 2025-12-04T11:11:20.4560343Z Shader Engines: 32 2025-12-04T11:11:20.4560496Z Shader Arrs. per Eng.: 1 2025-12-04T11:11:20.4560658Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:11:20.4560819Z Coherent Host Access: FALSE 2025-12-04T11:11:20.4561010Z Memory Properties: 2025-12-04T11:11:20.4561127Z Features: KERNEL_DISPATCH 2025-12-04T11:11:20.4561266Z Fast F16 Operation: TRUE 2025-12-04T11:11:20.4561426Z Wavefront Size: 64(0x40) 2025-12-04T11:11:20.4561583Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4561722Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4561851Z x 1024(0x400) 2025-12-04T11:11:20.4561980Z y 1024(0x400) 2025-12-04T11:11:20.4562102Z z 1024(0x400) 2025-12-04T11:11:20.4562241Z Max Waves Per CU: 32(0x20) 2025-12-04T11:11:20.4562435Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:11:20.4562591Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4562731Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4562843Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4562974Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4563105Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4563247Z Max fbarriers/Workgrp: 32 2025-12-04T11:11:20.4563412Z Packet Processor uCode:: 185 2025-12-04T11:11:20.4563570Z SDMA engine uCode:: 24 2025-12-04T11:11:20.4563726Z IOMMU Support:: None 2025-12-04T11:11:20.4563866Z Pool Info: 2025-12-04T11:11:20.4563967Z Pool 1 2025-12-04T11:11:20.4564098Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:11:20.4564254Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4564403Z Allocatable: TRUE 2025-12-04T11:11:20.4564560Z Alloc Granule: 4KB 2025-12-04T11:11:20.4564724Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4564884Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4565045Z Accessible by all: FALSE 2025-12-04T11:11:20.4565178Z Pool 2 2025-12-04T11:11:20.4565309Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:11:20.4565458Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4565601Z Allocatable: TRUE 2025-12-04T11:11:20.4565758Z Alloc Granule: 4KB 2025-12-04T11:11:20.4565920Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4566080Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4566240Z Accessible by all: FALSE 2025-12-04T11:11:20.4566373Z Pool 3 2025-12-04T11:11:20.4566500Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:11:20.4566647Z Size: 268419072(0xfffc000) KB 2025-12-04T11:11:20.4566789Z Allocatable: TRUE 2025-12-04T11:11:20.4566943Z Alloc Granule: 4KB 2025-12-04T11:11:20.4567104Z Alloc Recommended Granule:2048KB 2025-12-04T11:11:20.4567263Z Alloc Alignment: 4KB 2025-12-04T11:11:20.4567420Z Accessible by all: FALSE 2025-12-04T11:11:20.4567558Z Pool 4 2025-12-04T11:11:20.4567678Z Segment: GROUP 2025-12-04T11:11:20.4567858Z Size: 64(0x40) KB 2025-12-04T11:11:20.4568000Z Allocatable: FALSE 2025-12-04T11:11:20.4568196Z Alloc Granule: 0KB 2025-12-04T11:11:20.4568358Z Alloc Recommended Granule:0KB 2025-12-04T11:11:20.4568516Z Alloc Alignment: 0KB 2025-12-04T11:11:20.4568674Z Accessible by all: FALSE 2025-12-04T11:11:20.4568811Z ISA Info: 2025-12-04T11:11:20.4568911Z ISA 1 2025-12-04T11:11:20.4569043Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:11:20.4569247Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4569410Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4569577Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4569739Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4569892Z Fast f16: TRUE 2025-12-04T11:11:20.4570045Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4570185Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4570314Z x 1024(0x400) 2025-12-04T11:11:20.4570441Z y 1024(0x400) 2025-12-04T11:11:20.4570570Z z 1024(0x400) 
2025-12-04T11:11:20.4570712Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4570847Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4570974Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4571111Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4571240Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4571389Z FBarrier Max Size: 32 2025-12-04T11:11:20.4571524Z ISA 2 2025-12-04T11:11:20.4571663Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:11:20.4571838Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:11:20.4572003Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:11:20.4572166Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4572348Z Default Rounding Mode: NEAR 2025-12-04T11:11:20.4572502Z Fast f16: TRUE 2025-12-04T11:11:20.4572654Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:11:20.4572799Z Workgroup Max Size per Dimension: 2025-12-04T11:11:20.4572922Z x 1024(0x400) 2025-12-04T11:11:20.4573052Z y 1024(0x400) 2025-12-04T11:11:20.4573177Z z 1024(0x400) 2025-12-04T11:11:20.4573314Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:11:20.4573610Z Grid Max Size per Dimension: 2025-12-04T11:11:20.4573728Z x 4294967295(0xffffffff) 2025-12-04T11:11:20.4573861Z y 4294967295(0xffffffff) 2025-12-04T11:11:20.4573990Z z 4294967295(0xffffffff) 2025-12-04T11:11:20.4574130Z FBarrier Max Size: 32 2025-12-04T11:11:20.4574269Z *** Done *** 2025-12-04T11:11:20.4584137Z ##[group]Run ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T11:11:20.4584506Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T11:11:20.4584783Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. Include a link to the runner logs so the runner can be identified" 2025-12-04T11:11:20.4585046Z if [[ $ngpu -eq 0 ]]; then 2025-12-04T11:11:20.4585193Z  echo "Error: Failed to detect any GPUs on the runner" 2025-12-04T11:11:20.4585332Z  echo "$msg" 2025-12-04T11:11:20.4585432Z  exit 1 2025-12-04T11:11:20.4585524Z fi 2025-12-04T11:11:20.4588276Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.4588421Z env: 2025-12-04T11:11:20.4588506Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.4588610Z ##[endgroup] 2025-12-04T11:11:20.5718540Z ##[group]Run pytorch/pytorch/.github/actions/diskspace-cleanup@main 2025-12-04T11:11:20.5718728Z with: 2025-12-04T11:11:20.5718866Z diskspace-cutoff: 70 2025-12-04T11:11:20.5718979Z env: 2025-12-04T11:11:20.5719080Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.5719187Z ##[endgroup] 2025-12-04T11:11:20.5746807Z ##[group]Run set -ex 2025-12-04T11:11:20.5747033Z set -ex 2025-12-04T11:11:20.5747150Z diskspace_cutoff=70 2025-12-04T11:11:20.5772791Z docker_root_dir=$(docker info -f '{{.DockerRootDir}}') 2025-12-04T11:11:20.5772967Z if [ ! -d "$docker_root_dir" ]; then 2025-12-04T11:11:20.5773180Z  echo "Docker root directory ($docker_root_dir) does not exist. Skipping disk space check." 2025-12-04T11:11:20.5773376Z  exit 0 2025-12-04T11:11:20.5773471Z fi 2025-12-04T11:11:20.5773645Z diskspace=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-12-04T11:11:20.5773982Z msg="Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified" 2025-12-04T11:11:20.5774265Z if [[ "$diskspace" -ge "$diskspace_cutoff" ]] ; then 2025-12-04T11:11:20.5774425Z  docker system prune -af 2025-12-04T11:11:20.5774620Z  diskspace_new=$(df -H --output=pcent ${docker_root_dir} | sed -n 2p | sed 's/%//' | sed 's/ //') 2025-12-04T11:11:20.5774839Z  if [[ "$diskspace_new" -gt "$diskspace_cutoff" ]] ; then 2025-12-04T11:11:20.5775009Z  diskspace_cutoff_int=$((diskspace_cutoff + 0)) 2025-12-04T11:11:20.5775166Z  difference=$((100 - diskspace_cutoff_int)) 2025-12-04T11:11:20.5775379Z  echo "Error: Available diskspace is less than $difference percent. Not enough diskspace." 2025-12-04T11:11:20.5775574Z  echo "$msg" 2025-12-04T11:11:20.5775680Z  exit 1 2025-12-04T11:11:20.5775783Z  else 2025-12-04T11:11:20.5775905Z  difference=$((diskspace - diskspace_new)) 2025-12-04T11:11:20.5776060Z  echo "Diskspace saved: $difference percent" 2025-12-04T11:11:20.5776203Z  fi 2025-12-04T11:11:20.5776289Z fi 2025-12-04T11:11:20.5780987Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.5781144Z env: 2025-12-04T11:11:20.5781238Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.5781347Z ##[endgroup] 2025-12-04T11:11:20.5799546Z + diskspace_cutoff=70 2025-12-04T11:11:20.5802160Z ++ docker info -f '{{.DockerRootDir}}' 2025-12-04T11:11:20.6186386Z + docker_root_dir=/home/runner/docker-data 2025-12-04T11:11:20.6186643Z + '[' '!' -d /home/runner/docker-data ']' 2025-12-04T11:11:20.6194765Z ++ df -H --output=pcent /home/runner/docker-data 2025-12-04T11:11:20.6195227Z ++ sed -n 2p 2025-12-04T11:11:20.6195467Z ++ sed s/%// 2025-12-04T11:11:20.6195696Z ++ sed 's/ //' 2025-12-04T11:11:20.6210979Z + diskspace=' 4' 2025-12-04T11:11:20.6211579Z + msg='Please file an issue on pytorch/pytorch reporting the faulty runner. 
Include a link to the runner logs so the runner can be identified' 2025-12-04T11:11:20.6212663Z + [[ 4 -ge 70 ]] 2025-12-04T11:11:20.6239027Z ##[group]Run RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-12-04T11:11:20.6239266Z RUNNER_ARTIFACT_DIR="${RUNNER_TEMP}/artifacts" 2025-12-04T11:11:20.6239443Z rm -rf "${RUNNER_ARTIFACT_DIR}" 2025-12-04T11:11:20.6239593Z mkdir -p "${RUNNER_ARTIFACT_DIR}" 2025-12-04T11:11:20.6239781Z echo "RUNNER_ARTIFACT_DIR=${RUNNER_ARTIFACT_DIR}" >> "${GITHUB_ENV}" 2025-12-04T11:11:20.6239958Z  2025-12-04T11:11:20.6240098Z RUNNER_TEST_RESULTS_DIR="${RUNNER_TEMP}/test-results" 2025-12-04T11:11:20.6240275Z rm -rf "${RUNNER_TEST_RESULTS_DIR}" 2025-12-04T11:11:20.6240422Z mkdir -p "${RUNNER_TEST_RESULTS_DIR}" 2025-12-04T11:11:20.6240623Z echo "RUNNER_TEST_RESULTS_DIR=${RUNNER_TEST_RESULTS_DIR}" >> "${GITHUB_ENV}" 2025-12-04T11:11:20.6240815Z  2025-12-04T11:11:20.6241119Z RUNNER_DOCS_DIR="${RUNNER_TEMP}/docs" 2025-12-04T11:11:20.6241265Z rm -rf "${RUNNER_DOCS_DIR}" 2025-12-04T11:11:20.6241412Z mkdir -p "${RUNNER_DOCS_DIR}" 2025-12-04T11:11:20.6241573Z echo "RUNNER_DOCS_DIR=${RUNNER_DOCS_DIR}" >> "${GITHUB_ENV}" 2025-12-04T11:11:20.6245923Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.6246064Z env: 2025-12-04T11:11:20.6246153Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.6246254Z ##[endgroup] 2025-12-04T11:11:20.6330757Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:20.6331071Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:20.6331329Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:20.6335727Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.6335888Z env: 2025-12-04T11:11:20.6335996Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.6336144Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:20.6336341Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:20.6336520Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:20.6336662Z ##[endgroup] 2025-12-04T11:11:20.6389407Z ##[group]Run # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-12-04T11:11:20.6389737Z # All GPUs are visible to the runner; visibility, if needed, will be set by run_test.py. 2025-12-04T11:11:20.6389953Z # Add render group for container creation. 2025-12-04T11:11:20.6390131Z render_gid=`cat /etc/group | grep render | cut -d: -f3` 2025-12-04T11:11:20.6390340Z # Ensure GPU isolation if pod is part of kubernetes setup with DEVICE_FLAG. 2025-12-04T11:11:20.6390552Z if [ -f "/etc/podinfo/gha-render-devices" ]; then 2025-12-04T11:11:20.6390740Z  DEVICE_FLAG=$(cat /etc/podinfo/gha-render-devices) 2025-12-04T11:11:20.6390889Z else 2025-12-04T11:11:20.6390996Z  DEVICE_FLAG="--device /dev/dri" 2025-12-04T11:11:20.6391130Z fi 2025-12-04T11:11:20.6391317Z # The --group-add daemon and --group-add bin are needed in the Ubuntu 24.04 and Almalinux OSs respectively. 2025-12-04T11:11:20.6391595Z # This is due to the device files (/dev/kfd & /dev/dri) being owned by video group on bare metal. 2025-12-04T11:11:20.6391849Z # This video group ID maps to subgid 1 inside the docker image due to the /etc/subgid entries. 2025-12-04T11:11:20.6392121Z # The group name corresponding to group ID 1 can change depending on the OS, so both are necessary. 
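The disk-space guard traced above reads the usage percentage of the Docker data root (4% here, well under the 70% cutoff, so no prune ran). A compact sketch of the same idea; the tail/tr parsing is a simplification of the sed pipeline in the actual step, and the cutoff value is taken from this log:

  # Prune Docker data if the filesystem holding it exceeds a usage cutoff.
  cutoff=70
  root=$(docker info -f '{{.DockerRootDir}}')
  used=$(df --output=pcent "$root" | tail -1 | tr -dc '0-9')
  if [ "$used" -ge "$cutoff" ]; then
    docker system prune -af
  fi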
2025-12-04T11:11:20.6392568Z echo "GPU_FLAG=--device=/dev/mem --device=/dev/kfd $DEVICE_FLAG --group-add video --group-add $render_gid --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host" >> "${GITHUB_ENV}" 2025-12-04T11:11:20.6397205Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:20.6397493Z env: 2025-12-04T11:11:20.6397585Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.6397713Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:20.6397887Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:20.6398047Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:20.6398221Z ##[endgroup] 2025-12-04T11:11:20.6476243Z ##[group]Run aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 2025-12-04T11:11:20.6476505Z with: 2025-12-04T11:11:20.6476700Z role-to-assume: arn:aws:iam::308535385114:role/gha_workflow_s3_and_ecr_read_only 2025-12-04T11:11:20.6476926Z aws-region: us-east-1 2025-12-04T11:11:20.6477067Z role-duration-seconds: 18000 2025-12-04T11:11:20.6477227Z audience: sts.amazonaws.com 2025-12-04T11:11:20.6477369Z env: 2025-12-04T11:11:20.6477484Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:20.6477761Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:20.6477980Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:20.6478274Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:20.6478939Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:20.6479576Z ##[endgroup] 2025-12-04T11:11:20.9482319Z Assuming role with OIDC 2025-12-04T11:11:21.3016143Z Authenticated as assumedRoleId AROAUPVRELQNLLCOPFEJR:GitHubActions 2025-12-04T11:11:21.3966083Z ##[group]Run aws-actions/amazon-ecr-login@062b18b96a7aff071d4dc91bc00c4c1a7945b076 2025-12-04T11:11:21.3966306Z with: 2025-12-04T11:11:21.3966414Z mask-password: true 2025-12-04T11:11:21.3966549Z registry-type: private 2025-12-04T11:11:21.3966679Z skip-logout: false 2025-12-04T11:11:21.3966789Z env: 2025-12-04T11:11:21.3966902Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:21.3967061Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:21.3967262Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:21.3967453Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:21.3968034Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:21.3968818Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:21.3968950Z AWS_REGION: us-east-1 2025-12-04T11:11:21.3969344Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:21.3969517Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:21.3971663Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:21.3971770Z ##[endgroup] 2025-12-04T11:11:21.8171748Z Logging into registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.4450604Z ##[group]Run env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 
2025-12-04T11:11:22.4450891Z env | grep '^GITHUB' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:22.4451138Z env | grep '^CI' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:22.4451364Z env | grep '^RUNNER' >> "${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:11:22.4456329Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:22.4456521Z env: 2025-12-04T11:11:22.4456642Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:22.4456824Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:22.4457047Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:22.4457258Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:22.4457921Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:22.4458731Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:22.4458881Z AWS_REGION: us-east-1 2025-12-04T11:11:22.4459126Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:22.4459317Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:22.4461670Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:22.4461784Z ##[endgroup] 2025-12-04T11:11:22.4548507Z ##[group]Run ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T11:11:22.4548706Z ngpu=$(rocminfo | grep -c -E 'Name:.*\sgfx') 2025-12-04T11:11:22.4548960Z if [[ $ngpu -lt 2 ]]; then #We are temporarily reducing this down to 2 from 4 so that we can run tests on nodes with less gpus. 2025-12-04T11:11:22.4549253Z  echo "Error: only $ngpu GPU(s) detected, at least 2 GPUs are needed for distributed jobs" 2025-12-04T11:11:22.4549445Z  exit 1 2025-12-04T11:11:22.4549545Z fi 2025-12-04T11:11:22.4552373Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:22.4552521Z env: 2025-12-04T11:11:22.4552622Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:22.4552769Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:22.4552953Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:22.4553124Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:22.4553655Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:22.4554159Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:22.4554285Z AWS_REGION: us-east-1 2025-12-04T11:11:22.4554451Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:22.4554614Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:22.4556638Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:22.4556752Z ##[endgroup] 2025-12-04T11:11:22.5694269Z ##[group]Run pytorch/test-infra/.github/actions/calculate-docker-image@main 2025-12-04T11:11:22.5694448Z with: 2025-12-04T11:11:22.5694726Z docker-image-name: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5695031Z use-custom-docker-registry: true 2025-12-04T11:11:22.5695158Z docker-build-dir: .ci/docker 2025-12-04T11:11:22.5695279Z docker-build-script: ./build.sh 2025-12-04T11:11:22.5695398Z working-directory: . 
2025-12-04T11:11:22.5695537Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5695691Z force-push: false 2025-12-04T11:11:22.5695785Z env: 2025-12-04T11:11:22.5695876Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:22.5696011Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:22.5696188Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:22.5696360Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:22.5696866Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:22.5697356Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:22.5697469Z AWS_REGION: us-east-1 2025-12-04T11:11:22.5697612Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:22.5697762Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:22.5699829Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:22.5699951Z ##[endgroup] 2025-12-04T11:11:22.5708220Z ##[group]Run set -ex 2025-12-04T11:11:22.5708345Z set -ex 2025-12-04T11:11:22.5708438Z  2025-12-04T11:11:22.5708688Z # If the docker build directory or the build script doesn't exist, the action will 2025-12-04T11:11:22.5708937Z # gracefully return the docker image name as it is. Pulling docker image in Linux 2025-12-04T11:11:22.5709148Z # job could then download the pre-built image as usual 2025-12-04T11:11:22.5709401Z if [[ -d "${DOCKER_BUILD_DIR}" ]] && [[ -f "${DOCKER_BUILD_DIR}/${DOCKER_BUILD_SCRIPT}" ]] && [[ "${USE_CUSTOM_DOCKER_REGISTRY}" == "true" ]]; then 2025-12-04T11:11:22.5709636Z  echo "skip=false" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5709764Z else 2025-12-04T11:11:22.5709871Z  echo "skip=true" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5710044Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5710193Z  2025-12-04T11:11:22.5710399Z  echo "Not using custom ECR registry. Either it was not requested or there is no Docker build script in the ${REPO_NAME} repo..." 
2025-12-04T11:11:22.5710629Z  exit 0 2025-12-04T11:11:22.5710720Z fi 2025-12-04T11:11:22.5710807Z  2025-12-04T11:11:22.5710942Z if [[ "${DOCKER_IMAGE_NAME}" == *"${DOCKER_REGISTRY}/${REPO_NAME}"* ]]; then 2025-12-04T11:11:22.5711173Z  # The docker image name already includes the ECR prefix and tag, so we can just 2025-12-04T11:11:22.5711372Z  # use it as it is, but first let's extract the tag 2025-12-04T11:11:22.5711558Z  DOCKER_TAG=$(echo "${DOCKER_IMAGE_NAME}" | awk -F '[:,]' '{print $2}') 2025-12-04T11:11:22.5711751Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5711940Z  echo "docker-image=${DOCKER_IMAGE_NAME}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5712094Z else 2025-12-04T11:11:22.5712205Z  if [[ "${DOCKER_IMAGE_NAME}" == *:* ]]; then 2025-12-04T11:11:22.5712358Z  CUSTOM_TAG_PREFIX=${DOCKER_IMAGE_NAME#*:} 2025-12-04T11:11:22.5712515Z  DOCKER_IMAGE_NAME=${DOCKER_IMAGE_NAME%%:*} 2025-12-04T11:11:22.5712645Z  fi 2025-12-04T11:11:22.5712879Z  DOCKER_TAG=${CUSTOM_TAG_PREFIX:+${CUSTOM_TAG_PREFIX}-}$(git rev-parse HEAD:"${DOCKER_BUILD_DIR}") 2025-12-04T11:11:22.5713106Z  echo "docker-tag=${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5713346Z  echo "docker-image=${DOCKER_REGISTRY}/${REPO_NAME}/${DOCKER_IMAGE_NAME}:${DOCKER_TAG}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5713603Z  echo "custom-tag-prefix=${CUSTOM_TAG_PREFIX}" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5713763Z fi 2025-12-04T11:11:22.5716421Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:22.5716563Z env: 2025-12-04T11:11:22.5716660Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:22.5716805Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:22.5716984Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:22.5717153Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:22.5717658Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:22.5718182Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:22.5718300Z AWS_REGION: us-east-1 2025-12-04T11:11:22.5718439Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:22.5718594Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:22.5720596Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:22.5720700Z REPO_NAME: pytorch 2025-12-04T11:11:22.5720979Z DOCKER_IMAGE_NAME: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5721278Z DOCKER_BUILD_DIR: .ci/docker 2025-12-04T11:11:22.5721443Z DOCKER_BUILD_SCRIPT: ./build.sh 2025-12-04T11:11:22.5721597Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5721760Z USE_CUSTOM_DOCKER_REGISTRY: true 2025-12-04T11:11:22.5721881Z CUSTOM_TAG_PREFIX: 2025-12-04T11:11:22.5721985Z ##[endgroup] 2025-12-04T11:11:22.5737746Z + [[ -d .ci/docker ]] 2025-12-04T11:11:22.5737922Z + [[ -f .ci/docker/./build.sh ]] 2025-12-04T11:11:22.5738063Z + [[ true == \t\r\u\e ]] 2025-12-04T11:11:22.5738684Z + echo skip=false 2025-12-04T11:11:22.5739162Z + [[ 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a == 
*\3\0\8\5\3\5\3\8\5\1\1\4\.\d\k\r\.\e\c\r\.\u\s\-\e\a\s\t\-\1\.\a\m\a\z\o\n\a\w\s\.\c\o\m\/\p\y\t\o\r\c\h* ]] 2025-12-04T11:11:22.5746858Z ++ echo 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5749135Z ++ awk -F '[:,]' '{print $2}' 2025-12-04T11:11:22.5762813Z + DOCKER_TAG=pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5763412Z + echo docker-tag=pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5764149Z + echo docker-image=308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5792688Z ##[group]Run set +e 2025-12-04T11:11:22.5792878Z set +e 2025-12-04T11:11:22.5792999Z set -x 2025-12-04T11:11:22.5793120Z  2025-12-04T11:11:22.5793229Z login() { 2025-12-04T11:11:22.5793461Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-12-04T11:11:22.5793698Z } 2025-12-04T11:11:22.5793815Z  2025-12-04T11:11:22.5793923Z retry () { 2025-12-04T11:11:22.5794060Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T11:11:22.5794209Z } 2025-12-04T11:11:22.5794339Z  2025-12-04T11:11:22.5794457Z retry login "${DOCKER_REGISTRY}" 2025-12-04T11:11:22.5794601Z  2025-12-04T11:11:22.5794876Z START_TIME=$(date +%s) 2025-12-04T11:11:22.5795025Z # Wait up to 120 minutes 2025-12-04T11:11:22.5795200Z while [[ $(( $(date +%s) - 7200 )) -lt $START_TIME ]]; do 2025-12-04T11:11:22.5795432Z  # Check if image already exists, if it does then skip building it 2025-12-04T11:11:22.5795650Z  if docker manifest inspect "${DOCKER_IMAGE}"; then 2025-12-04T11:11:22.5795821Z  exit 0 2025-12-04T11:11:22.5795940Z  fi 2025-12-04T11:11:22.5796050Z  2025-12-04T11:11:22.5796227Z  # NB: This flag is used by Docker build workflow to push the image to ECR, so we can 2025-12-04T11:11:22.5796518Z  # use this to differentiate between the Docker build and regular build jobs. For the 2025-12-04T11:11:22.5796806Z  # latter, it will wait for the Docker images to become available before continuing 2025-12-04T11:11:22.5797041Z  if [ "${DOCKER_PUSH:-false}" == "true" ]; then 2025-12-04T11:11:22.5797234Z  # It's a Docker build job, let's build the image 2025-12-04T11:11:22.5797396Z  break 2025-12-04T11:11:22.5797511Z  else 2025-12-04T11:11:22.5797666Z  # It's a regular build job, wait for the image to become available 2025-12-04T11:11:22.5797842Z  sleep 300 2025-12-04T11:11:22.5797958Z  fi 2025-12-04T11:11:22.5798057Z done 2025-12-04T11:11:22.5798335Z  2025-12-04T11:11:22.5798484Z # NB: This part requires a full checkout. Otherwise, the merge base will 2025-12-04T11:11:22.5798706Z # be empty. 
The default action would be to continue rebuild the image 2025-12-04T11:11:22.5798908Z if [[ "$BASE_REVISION" = "$(git rev-parse HEAD)" ]]; then 2025-12-04T11:11:22.5799090Z  # if we're on the base branch then use the parent commit 2025-12-04T11:11:22.5799376Z  MERGE_BASE=$(git rev-parse HEAD~) 2025-12-04T11:11:22.5799516Z else 2025-12-04T11:11:22.5799654Z  # otherwise we're on a PR, so use the most recent base commit 2025-12-04T11:11:22.5799843Z  MERGE_BASE=$(git merge-base HEAD "$BASE_REVISION") 2025-12-04T11:11:22.5799989Z fi 2025-12-04T11:11:22.5800089Z  2025-12-04T11:11:22.5800194Z if [[ -z "${MERGE_BASE}" ]]; then 2025-12-04T11:11:22.5800342Z  echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5800480Z  2025-12-04T11:11:22.5800677Z  echo "Finding merge base only works with full checkout, please set fetch-depth to 0, continuing ..." 2025-12-04T11:11:22.5800885Z  exit 0 2025-12-04T11:11:22.5800986Z fi 2025-12-04T11:11:22.5801075Z  2025-12-04T11:11:22.5801207Z if ! git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}"; then 2025-12-04T11:11:22.5801474Z  echo "Directory '${DOCKER_BUILD_DIR}' not found in commit $MERGE_BASE, you should rebase onto a more recent commit" 2025-12-04T11:11:22.5801700Z  exit 1 2025-12-04T11:11:22.5801793Z fi 2025-12-04T11:11:22.5801888Z  2025-12-04T11:11:22.5802037Z PREVIOUS_DOCKER_TAG=$(git rev-parse "${MERGE_BASE}:${DOCKER_BUILD_DIR}") 2025-12-04T11:11:22.5802310Z # If no image exists but the hash is the same as the previous hash then we should error out here 2025-12-04T11:11:22.5802534Z if [[ "${PREVIOUS_DOCKER_TAG}" == "${DOCKER_TAG}" ]]; then 2025-12-04T11:11:22.5802797Z  echo "WARNING: Something has gone wrong and the previous image isn't available for the merge-base of your branch" 2025-12-04T11:11:22.5803079Z  echo " Will re-build docker image to store in local cache, TTS may be longer" 2025-12-04T11:11:22.5803255Z fi 2025-12-04T11:11:22.5803346Z  2025-12-04T11:11:22.5803462Z echo "rebuild=true" >> "${GITHUB_OUTPUT}" 2025-12-04T11:11:22.5808128Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:22.5808390Z env: 2025-12-04T11:11:22.5808492Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:22.5808636Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:22.5808818Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:22.5808989Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:22.5809503Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:22.5810008Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:22.5810129Z AWS_REGION: us-east-1 2025-12-04T11:11:22.5810362Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:22.5810523Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:22.5812571Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:22.5812691Z DOCKER_BUILD_DIR: .ci/docker 2025-12-04T11:11:22.5812835Z BASE_REVISION: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:11:22.5813158Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5813529Z DOCKER_TAG: pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:22.5813771Z DOCKER_REGISTRY: 
308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5813923Z DOCKER_PUSH: 2025-12-04T11:11:22.5814027Z ##[endgroup] 2025-12-04T11:11:22.5832270Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5832467Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5834922Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:22.5835845Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:22.5836214Z /home/runner/_work/_temp/884c5434-78da-4dd3-af27-7ddeb9346173.sh: line 5: aws: command not found 2025-12-04T11:11:22.5935646Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:22.5943636Z + sleep 1 2025-12-04T11:11:23.5953529Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:23.5957808Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:23.5958278Z /home/runner/_work/_temp/884c5434-78da-4dd3-af27-7ddeb9346173.sh: line 5: aws: command not found 2025-12-04T11:11:23.5958726Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:23.6071996Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:23.6086562Z + sleep 2 2025-12-04T11:11:25.6097178Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:25.6103142Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:25.6103976Z /home/runner/_work/_temp/884c5434-78da-4dd3-af27-7ddeb9346173.sh: line 5: aws: command not found 2025-12-04T11:11:25.6104763Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:25.6211188Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:25.6225304Z ++ date +%s 2025-12-04T11:11:25.6236242Z + START_TIME=1764846685 2025-12-04T11:11:25.6241092Z ++ date +%s 2025-12-04T11:11:25.6251013Z + [[ 1764839485 -lt 1764846685 ]] 2025-12-04T11:11:25.6251585Z + docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:26.9741662Z { 2025-12-04T11:11:26.9742073Z "schemaVersion": 2, 2025-12-04T11:11:26.9742638Z "mediaType": "application/vnd.docker.distribution.manifest.v2+json", 2025-12-04T11:11:26.9743130Z "config": { 2025-12-04T11:11:26.9743516Z "mediaType": "application/vnd.docker.container.image.v1+json", 2025-12-04T11:11:26.9744016Z "size": 30522, 2025-12-04T11:11:26.9744482Z "digest": "sha256:79498ef00fdf8abfcde955fd685c3a7412c33ca80383b5905abfdc3c70621215" 2025-12-04T11:11:26.9745731Z }, 2025-12-04T11:11:26.9745964Z "layers": [ 2025-12-04T11:11:26.9746194Z { 2025-12-04T11:11:26.9746556Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9746992Z "size": 30594402, 2025-12-04T11:11:26.9747448Z "digest": "sha256:02de03a7213b62b792ec66a7efb8c86c4117ca00fb8651facf8ecfe33044b485" 2025-12-04T11:11:26.9747804Z }, 2025-12-04T11:11:26.9747956Z { 2025-12-04T11:11:26.9748458Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9748755Z "size": 1554, 2025-12-04T11:11:26.9749055Z "digest": "sha256:3a5718b5258e28918133dd74ea64bd506b2c15530a2fa8a72c45c5b0d8f7c7b0" 2025-12-04T11:11:26.9749384Z }, 2025-12-04T11:11:26.9749529Z { 2025-12-04T11:11:26.9749773Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9750079Z "size": 335779211, 2025-12-04T11:11:26.9750400Z "digest": 
"sha256:bf3aa22776924a41b55849f0f30cb22af45d41da1177a9d682cf94cde99d8f98" 2025-12-04T11:11:26.9750738Z }, 2025-12-04T11:11:26.9750887Z { 2025-12-04T11:11:26.9751129Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9751423Z "size": 704, 2025-12-04T11:11:26.9751717Z "digest": "sha256:9d58e5257cefd43e8226153d71d28a865253662146aa9fce9a9f95af67b497fa" 2025-12-04T11:11:26.9752038Z }, 2025-12-04T11:11:26.9752185Z { 2025-12-04T11:11:26.9752423Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9752712Z "size": 1770, 2025-12-04T11:11:26.9753007Z "digest": "sha256:fde80a64553533a56c032d4bc388837e7d4631a0424d1bfe135703165b67fd4d" 2025-12-04T11:11:26.9753330Z }, 2025-12-04T11:11:26.9753477Z { 2025-12-04T11:11:26.9753715Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9754186Z "size": 485, 2025-12-04T11:11:26.9754671Z "digest": "sha256:6931c5f20e80e481e4f484471ff3a02878b4f8c54a9a5a4717213fdaa35c0bff" 2025-12-04T11:11:26.9754994Z }, 2025-12-04T11:11:26.9755147Z { 2025-12-04T11:11:26.9755385Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9755677Z "size": 120663474, 2025-12-04T11:11:26.9755993Z "digest": "sha256:170ea6d3edd62991e37d2e6ebe53dfcd4601f5d42e8f9720af5f8db5fc267856" 2025-12-04T11:11:26.9756323Z }, 2025-12-04T11:11:26.9756471Z { 2025-12-04T11:11:26.9756710Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9756999Z "size": 4433, 2025-12-04T11:11:26.9757266Z "digest": "sha256:dc8487f6c81cac00fa33031f8d3481e2c3634c4f064a9c4c36b87b41e78bc9fb" 2025-12-04T11:11:26.9757507Z }, 2025-12-04T11:11:26.9757616Z { 2025-12-04T11:11:26.9757791Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9758003Z "size": 1755, 2025-12-04T11:11:26.9758284Z "digest": "sha256:9748c5348f39a11c960c49fd9219fdea1c23e612ed11a02d71501424defc80f5" 2025-12-04T11:11:26.9758527Z }, 2025-12-04T11:11:26.9758632Z { 2025-12-04T11:11:26.9758814Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9759027Z "size": 724, 2025-12-04T11:11:26.9759246Z "digest": "sha256:8539cc3f8d8a138501ed0255c0cd7ec491bc0add9e4a62095f1c0f9533daa1cc" 2025-12-04T11:11:26.9759486Z }, 2025-12-04T11:11:26.9759596Z { 2025-12-04T11:11:26.9759800Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9760016Z "size": 3378352584, 2025-12-04T11:11:26.9760250Z "digest": "sha256:af88f886884fe6f1a1992efb7ce8473901f795eef69caa199443f3e076fdfd5b" 2025-12-04T11:11:26.9760578Z }, 2025-12-04T11:11:26.9760865Z { 2025-12-04T11:11:26.9761141Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9761356Z "size": 396, 2025-12-04T11:11:26.9761576Z "digest": "sha256:32fbb88555c4195c45c7008cf92e389d67acc79a7e382503003ef93bcb886afe" 2025-12-04T11:11:26.9761822Z }, 2025-12-04T11:11:26.9761933Z { 2025-12-04T11:11:26.9762189Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9762417Z "size": 80171601, 2025-12-04T11:11:26.9762662Z "digest": "sha256:3231e1ab814b143b244037c540b637be259085834865ac43b1ed2b6f6ad631e1" 2025-12-04T11:11:26.9762898Z }, 2025-12-04T11:11:26.9763010Z { 2025-12-04T11:11:26.9763187Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9763400Z "size": 787, 2025-12-04T11:11:26.9763623Z "digest": "sha256:80061bf5dcbb9a4e38ac865a9cdc0a615bb294e3e6bfa357a6d515dcf3f54abc" 
2025-12-04T11:11:26.9763871Z }, 2025-12-04T11:11:26.9763981Z { 2025-12-04T11:11:26.9764155Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9764367Z "size": 106, 2025-12-04T11:11:26.9764586Z "digest": "sha256:6e9524f4518ec02b47ff12c55b6b6afbc65b3f4be59072e2afe20c2c87522549" 2025-12-04T11:11:26.9764832Z }, 2025-12-04T11:11:26.9764938Z { 2025-12-04T11:11:26.9765120Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9765355Z "size": 1495, 2025-12-04T11:11:26.9765572Z "digest": "sha256:ce919d4bf5eeff71d49b160a16603117225530497c3905e02224227d11e2ff88" 2025-12-04T11:11:26.9765810Z }, 2025-12-04T11:11:26.9765920Z { 2025-12-04T11:11:26.9766095Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9766310Z "size": 548601195, 2025-12-04T11:11:26.9766534Z "digest": "sha256:47681e3e6f37423139a5c86549ffbb43e4f258344b0461208f5821263da152e9" 2025-12-04T11:11:26.9766769Z }, 2025-12-04T11:11:26.9766877Z { 2025-12-04T11:11:26.9767052Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9767250Z "size": 162, 2025-12-04T11:11:26.9767427Z "digest": "sha256:cb70fe22c9ebacebfe8402519059c8a66da6d5a77979e4c0ecdb3a762bebe357" 2025-12-04T11:11:26.9767675Z }, 2025-12-04T11:11:26.9767764Z { 2025-12-04T11:11:26.9767905Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9768083Z "size": 104, 2025-12-04T11:11:26.9768305Z "digest": "sha256:17858e829c8cfe9a7e22516e03ad5273d8cf5c50f58edb10ff60c74e15c8e1f6" 2025-12-04T11:11:26.9768498Z }, 2025-12-04T11:11:26.9768588Z { 2025-12-04T11:11:26.9768727Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9768897Z "size": 724, 2025-12-04T11:11:26.9769072Z "digest": "sha256:8539cc3f8d8a138501ed0255c0cd7ec491bc0add9e4a62095f1c0f9533daa1cc" 2025-12-04T11:11:26.9769263Z }, 2025-12-04T11:11:26.9769354Z { 2025-12-04T11:11:26.9769496Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9769667Z "size": 196, 2025-12-04T11:11:26.9769843Z "digest": "sha256:a63f3b4eed1157bcb3c51b64196e74e9f10d1f923652b02fd433c6ed993597ff" 2025-12-04T11:11:26.9770038Z }, 2025-12-04T11:11:26.9770130Z { 2025-12-04T11:11:26.9770277Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9770448Z "size": 2584, 2025-12-04T11:11:26.9770635Z "digest": "sha256:10ab3d1afbc4cb2d3ced8f3e0072c0b1dd124dcadcf68b95fadf8a7a9f663860" 2025-12-04T11:11:26.9770831Z }, 2025-12-04T11:11:26.9770920Z { 2025-12-04T11:11:26.9771061Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9771234Z "size": 7652105336, 2025-12-04T11:11:26.9771418Z "digest": "sha256:98ca88b5095b449a2f2d753a21217856271912fbe51c2d99f928a2196f4097d5" 2025-12-04T11:11:26.9771609Z }, 2025-12-04T11:11:26.9771698Z { 2025-12-04T11:11:26.9771841Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9772012Z "size": 135, 2025-12-04T11:11:26.9772184Z "digest": "sha256:025c90839a58c768b3cc444e48cae67c1a5b2c85320ad8827231f0ba390cf9aa" 2025-12-04T11:11:26.9772374Z }, 2025-12-04T11:11:26.9772466Z { 2025-12-04T11:11:26.9772606Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9772780Z "size": 104, 2025-12-04T11:11:26.9773023Z "digest": "sha256:9255df5942ae69fee24f8074314f451d5d2f1ca71b6c777274297fd43a0032d8" 2025-12-04T11:11:26.9773212Z }, 2025-12-04T11:11:26.9773303Z { 2025-12-04T11:11:26.9773442Z 
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9773612Z "size": 612, 2025-12-04T11:11:26.9773788Z "digest": "sha256:f71ca9d4ed1c4ca8177602f3cb0db83d9787ea6c258a8ef203387b308ff3e0f0" 2025-12-04T11:11:26.9773980Z }, 2025-12-04T11:11:26.9774067Z { 2025-12-04T11:11:26.9774206Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9774371Z "size": 838191953, 2025-12-04T11:11:26.9774552Z "digest": "sha256:d02b47b56ca7f3598f5943d4fdc7139d5e3d3bc82d49185cedf9817dd55fc75c" 2025-12-04T11:11:26.9774738Z }, 2025-12-04T11:11:26.9774824Z { 2025-12-04T11:11:26.9774959Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9775126Z "size": 111, 2025-12-04T11:11:26.9775299Z "digest": "sha256:40279492aea7bc8fb650842b495912195621c21b14cef4c717a9e0a9fc535131" 2025-12-04T11:11:26.9775483Z }, 2025-12-04T11:11:26.9775568Z { 2025-12-04T11:11:26.9775699Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9775864Z "size": 1556, 2025-12-04T11:11:26.9776035Z "digest": "sha256:33a27ce74abd7e32a03a564fc45005bc75904b53ad516f18d47facbeb2f2794e" 2025-12-04T11:11:26.9776225Z }, 2025-12-04T11:11:26.9776311Z { 2025-12-04T11:11:26.9776452Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9776622Z "size": 107, 2025-12-04T11:11:26.9776795Z "digest": "sha256:6b66ed335d1d8df6140caba76d9c2babed83bb37962e1e638825d49e67184fa5" 2025-12-04T11:11:26.9776985Z }, 2025-12-04T11:11:26.9777074Z { 2025-12-04T11:11:26.9777210Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9777411Z "size": 166, 2025-12-04T11:11:26.9777573Z "digest": "sha256:9f010fa04118bfee2d7b4481e6badb714032bde0652b04151a6599e57e1bd91b" 2025-12-04T11:11:26.9777751Z }, 2025-12-04T11:11:26.9777842Z { 2025-12-04T11:11:26.9777973Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9778132Z "size": 3702493, 2025-12-04T11:11:26.9778408Z "digest": "sha256:6c64d5e8bb6ae6ef4e3f1d316429d8b14a6e8a1fb410fb83b96c8bbd4a0a095c" 2025-12-04T11:11:26.9778590Z }, 2025-12-04T11:11:26.9778674Z { 2025-12-04T11:11:26.9778804Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9778962Z "size": 107, 2025-12-04T11:11:26.9779130Z "digest": "sha256:c20ea058f549f5f5538c95c5e0da23afbbc9fb7ffc1987d126fe684eeed743f5" 2025-12-04T11:11:26.9779314Z }, 2025-12-04T11:11:26.9779399Z { 2025-12-04T11:11:26.9779530Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9779691Z "size": 829, 2025-12-04T11:11:26.9779855Z "digest": "sha256:3c4fd2d54638a1336d39769fe36041aa6d186a8dea0e7096b8d8a7068ba0d3c0" 2025-12-04T11:11:26.9780034Z }, 2025-12-04T11:11:26.9780117Z { 2025-12-04T11:11:26.9780249Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9780407Z "size": 26673844, 2025-12-04T11:11:26.9780575Z "digest": "sha256:964ebac3d7a95c64ea7f0d828cd58e6244cc955e9a099a2525079ecf64026e3f" 2025-12-04T11:11:26.9780753Z }, 2025-12-04T11:11:26.9780831Z { 2025-12-04T11:11:26.9780960Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9781119Z "size": 104, 2025-12-04T11:11:26.9781284Z "digest": "sha256:2aaa7210673fc5bd15d36e54ee5c3fb495d1eafa1cb8d686054ccedb1c37bfc8" 2025-12-04T11:11:26.9781468Z }, 2025-12-04T11:11:26.9781552Z { 2025-12-04T11:11:26.9781682Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9781843Z 
"size": 424, 2025-12-04T11:11:26.9782005Z "digest": "sha256:fa273daa00371a98ed668535e14b8cc3cb425feba0b601b3e3c72314d0234312" 2025-12-04T11:11:26.9782190Z }, 2025-12-04T11:11:26.9782275Z { 2025-12-04T11:11:26.9782452Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9782612Z "size": 19279582, 2025-12-04T11:11:26.9782784Z "digest": "sha256:d931a62fd2408369decfa0e6eac11768e35d0ffddee87d769c82aaf1ad7e2899" 2025-12-04T11:11:26.9782966Z }, 2025-12-04T11:11:26.9783050Z { 2025-12-04T11:11:26.9783181Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9783342Z "size": 826, 2025-12-04T11:11:26.9783504Z "digest": "sha256:d3573d61c28e1400840260d3c2c786c9e104f6558162beac799e55b6f5c1e747" 2025-12-04T11:11:26.9783677Z }, 2025-12-04T11:11:26.9783761Z { 2025-12-04T11:11:26.9783893Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9784051Z "size": 724, 2025-12-04T11:11:26.9784213Z "digest": "sha256:8539cc3f8d8a138501ed0255c0cd7ec491bc0add9e4a62095f1c0f9533daa1cc" 2025-12-04T11:11:26.9784395Z }, 2025-12-04T11:11:26.9784481Z { 2025-12-04T11:11:26.9784611Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9784778Z "size": 149, 2025-12-04T11:11:26.9784939Z "digest": "sha256:f9b32f08c49055dd61bd359d5f42f6adb9e5a183c2821d97d11572dd7ce1e91f" 2025-12-04T11:11:26.9785120Z }, 2025-12-04T11:11:26.9785208Z { 2025-12-04T11:11:26.9785341Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9785500Z "size": 136, 2025-12-04T11:11:26.9785654Z "digest": "sha256:3a0206399d60f6e8897f78c8e8f81b59d51969a329ef45485d28ae19607ca72c" 2025-12-04T11:11:26.9785829Z }, 2025-12-04T11:11:26.9785912Z { 2025-12-04T11:11:26.9786043Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9786200Z "size": 140, 2025-12-04T11:11:26.9786360Z "digest": "sha256:386f322edd1c1c275126bab065c22fcd3950916c1fb8491a21a7f5c358af599a" 2025-12-04T11:11:26.9786537Z }, 2025-12-04T11:11:26.9786677Z { 2025-12-04T11:11:26.9786806Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9786963Z "size": 32, 2025-12-04T11:11:26.9787130Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T11:11:26.9787309Z }, 2025-12-04T11:11:26.9787394Z { 2025-12-04T11:11:26.9787527Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9787686Z "size": 223, 2025-12-04T11:11:26.9787846Z "digest": "sha256:bbe49df30697f6959cd958299909d9255cd54663ce2e9e2c2d378f8f9dfe8345" 2025-12-04T11:11:26.9788025Z }, 2025-12-04T11:11:26.9788109Z { 2025-12-04T11:11:26.9788279Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9788438Z "size": 346, 2025-12-04T11:11:26.9788598Z "digest": "sha256:d6630aa6f375b12cb7471c5b60eb32e02ff8d70adf4497e061d6c15fead186c7" 2025-12-04T11:11:26.9788782Z }, 2025-12-04T11:11:26.9788866Z { 2025-12-04T11:11:26.9789007Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9789163Z "size": 88302, 2025-12-04T11:11:26.9789328Z "digest": "sha256:6d807afc1309592c99c7d77af3874afb54c1718377fe721ac0cc616f59d291b9" 2025-12-04T11:11:26.9789499Z }, 2025-12-04T11:11:26.9789576Z { 2025-12-04T11:11:26.9789701Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9789854Z "size": 106, 2025-12-04T11:11:26.9790007Z "digest": 
"sha256:60b679430e4e0b7690392dfe4f5dc417847f7a3ba2b761ce747b66d412e1d956" 2025-12-04T11:11:26.9790178Z }, 2025-12-04T11:11:26.9790257Z { 2025-12-04T11:11:26.9790380Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9790532Z "size": 1671, 2025-12-04T11:11:26.9790692Z "digest": "sha256:3992ae84f9eda1c5c52fa96b1f1d0fc3f93c661c5cf0b971a504a260c290da49" 2025-12-04T11:11:26.9790865Z }, 2025-12-04T11:11:26.9790943Z { 2025-12-04T11:11:26.9791069Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9791224Z "size": 724, 2025-12-04T11:11:26.9791423Z "digest": "sha256:8539cc3f8d8a138501ed0255c0cd7ec491bc0add9e4a62095f1c0f9533daa1cc" 2025-12-04T11:11:26.9791598Z }, 2025-12-04T11:11:26.9791675Z { 2025-12-04T11:11:26.9791800Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9791952Z "size": 138, 2025-12-04T11:11:26.9792110Z "digest": "sha256:62d400609f9c38fce4745f72372423072ba0f142b3c03775ccb317f6c5240966" 2025-12-04T11:11:26.9792279Z }, 2025-12-04T11:11:26.9792356Z { 2025-12-04T11:11:26.9792486Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9792643Z "size": 119, 2025-12-04T11:11:26.9792801Z "digest": "sha256:7e7b097490967d568331cc9f8afdd02422fe101c6364ec5e12dba2970991e533" 2025-12-04T11:11:26.9793062Z }, 2025-12-04T11:11:26.9793179Z { 2025-12-04T11:11:26.9793356Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9807092Z "size": 6231259865, 2025-12-04T11:11:26.9807293Z "digest": "sha256:7dcdbd8421cb17aaa5d0cb965ddf94e196cb364e762b12ab78024cb25e3b6bcd" 2025-12-04T11:11:26.9807487Z }, 2025-12-04T11:11:26.9807576Z { 2025-12-04T11:11:26.9807719Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9807884Z "size": 174, 2025-12-04T11:11:26.9808049Z "digest": "sha256:cbb12613719bab9f179968227f9fb8881251992804e460b9a9e1c00f3ac4a0c5" 2025-12-04T11:11:26.9808277Z }, 2025-12-04T11:11:26.9808365Z { 2025-12-04T11:11:26.9808498Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9808659Z "size": 1896, 2025-12-04T11:11:26.9808825Z "digest": "sha256:e87038dce9bc8e13bd64006847d30ddcaf77455256c4985fccfec83f82d4b925" 2025-12-04T11:11:26.9809004Z }, 2025-12-04T11:11:26.9809088Z { 2025-12-04T11:11:26.9809222Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9809384Z "size": 162783968, 2025-12-04T11:11:26.9809627Z "digest": "sha256:e4606b636f96f1c80f4be26aeb9d6f5f990f6149789c2de160451c5ac76a467d" 2025-12-04T11:11:26.9809806Z }, 2025-12-04T11:11:26.9809889Z { 2025-12-04T11:11:26.9810021Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9810181Z "size": 302, 2025-12-04T11:11:26.9810342Z "digest": "sha256:6f2a5d33b946e561219b9968769773e36ce1d28bee8c62eff652098b7825fc79" 2025-12-04T11:11:26.9810518Z }, 2025-12-04T11:11:26.9810602Z { 2025-12-04T11:11:26.9810734Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9810892Z "size": 32, 2025-12-04T11:11:26.9811056Z "digest": "sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1" 2025-12-04T11:11:26.9811237Z }, 2025-12-04T11:11:26.9811319Z { 2025-12-04T11:11:26.9811448Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9811605Z "size": 108, 2025-12-04T11:11:26.9811765Z "digest": "sha256:a4f2bf2f19e63b91d46f2d9cf11a25c657517a6835996404da1e79a09d918b0e" 
2025-12-04T11:11:26.9811947Z }, 2025-12-04T11:11:26.9812029Z { 2025-12-04T11:11:26.9812162Z "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", 2025-12-04T11:11:26.9812324Z "size": 54145661, 2025-12-04T11:11:26.9812494Z "digest": "sha256:1ae00acdac56cbc6d3f81b3c5d854a2b77c30d458b0fbe18c5935145364484f0" 2025-12-04T11:11:26.9812678Z } 2025-12-04T11:11:26.9812763Z ] 2025-12-04T11:11:26.9812848Z } 2025-12-04T11:11:26.9812944Z + exit 0 2025-12-04T11:11:26.9830545Z ##[group]Run set -eux 2025-12-04T11:11:26.9830681Z set -eux 2025-12-04T11:11:26.9830851Z # It's ok if this steps fails, it would then be an anonymous user like what we used to have 2025-12-04T11:11:26.9831275Z aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token | jq --raw-output '.SecretString' | jq -r .docker_hub_readonly_token | docker login --username pytorchbot --password-stdin || true 2025-12-04T11:11:26.9836039Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:26.9836193Z env: 2025-12-04T11:11:26.9836292Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:26.9836486Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:26.9836667Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:26.9836839Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:26.9837341Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:26.9837833Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:26.9837955Z AWS_REGION: us-east-1 2025-12-04T11:11:26.9838234Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:26.9838392Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:26.9840438Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:26.9840549Z ##[endgroup] 2025-12-04T11:11:26.9868229Z + aws secretsmanager get-secret-value --secret-id docker_hub_readonly_token 2025-12-04T11:11:26.9868808Z /home/runner/_work/_temp/dbe5e399-4f52-4f96-b484-e9ecd25a675b.sh: line 3: aws: command not found 2025-12-04T11:11:26.9869217Z + jq --raw-output .SecretString 2025-12-04T11:11:26.9869547Z + jq -r .docker_hub_readonly_token 2025-12-04T11:11:26.9872453Z + docker login --username pytorchbot --password-stdin 2025-12-04T11:11:26.9984421Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:26.9992644Z + true 2025-12-04T11:11:27.0056739Z ##[group]Run pytorch/test-infra/.github/actions/pull-docker-image@main 2025-12-04T11:11:27.0056940Z with: 2025-12-04T11:11:27.0057224Z docker-image: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:27.0057563Z docker-registry: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:27.0057880Z env: 2025-12-04T11:11:27.0057987Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:27.0058134Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:27.0058379Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:27.0058555Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:27.0059091Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon 
--group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:27.0059592Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:27.0059717Z AWS_REGION: us-east-1 2025-12-04T11:11:27.0059947Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:27.0060110Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:27.0062143Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:27.0062263Z ##[endgroup] 2025-12-04T11:11:27.0069252Z ##[group]Run set -x 2025-12-04T11:11:27.0069381Z set -x 2025-12-04T11:11:27.0069485Z set +e 2025-12-04T11:11:27.0069587Z  2025-12-04T11:11:27.0069706Z login() { 2025-12-04T11:11:27.0069901Z  aws ecr get-login-password --region us-east-1 | docker login -u AWS --password-stdin "$1" 2025-12-04T11:11:27.0070100Z } 2025-12-04T11:11:27.0070192Z  2025-12-04T11:11:27.0070286Z retry () { 2025-12-04T11:11:27.0070410Z  $* || (sleep 1 && $*) || (sleep 2 && $*) 2025-12-04T11:11:27.0070546Z } 2025-12-04T11:11:27.0070638Z  2025-12-04T11:11:27.0070740Z retry login "${DOCKER_REGISTRY}" 2025-12-04T11:11:27.0070868Z  2025-12-04T11:11:27.0071057Z IMAGE_SIZE=$(docker manifest inspect "${DOCKER_IMAGE}" | jq '[.layers[].size, .config.size] | add / 1024 / 1024') 2025-12-04T11:11:27.0071305Z echo "Compressed size of image in MB: ${IMAGE_SIZE}" 2025-12-04T11:11:27.0071458Z  2025-12-04T11:11:27.0071550Z set -e 2025-12-04T11:11:27.0071691Z # ignore output since only exit code is used for conditional 2025-12-04T11:11:27.0071881Z # only pull docker image if it's not available locally 2025-12-04T11:11:27.0072092Z if ! docker inspect --type=image "${DOCKER_IMAGE}" >/dev/null 2>/dev/null; then 2025-12-04T11:11:27.0072287Z  retry docker pull "${DOCKER_IMAGE}" 2025-12-04T11:11:27.0072418Z fi 2025-12-04T11:11:27.0076715Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:27.0076871Z env: 2025-12-04T11:11:27.0076969Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:27.0077110Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:27.0077292Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:27.0077464Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:27.0077973Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:27.0078519Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:27.0078644Z AWS_REGION: us-east-1 2025-12-04T11:11:27.0078788Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:27.0078948Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:27.0080954Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:27.0081328Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:27.0081654Z DOCKER_REGISTRY: 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:27.0081811Z ##[endgroup] 2025-12-04T11:11:27.0102697Z + set +e 2025-12-04T11:11:27.0102869Z + retry login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:27.0103133Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:27.0106694Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:27.0107184Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:27.0107470Z /home/runner/_work/_temp/01919194-d03e-4bd1-9aa7-72be92403208.sh: 
line 5: aws: command not found 2025-12-04T11:11:27.0215815Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:27.0223787Z + sleep 1 2025-12-04T11:11:28.0233607Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:28.0237669Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:28.0238514Z /home/runner/_work/_temp/01919194-d03e-4bd1-9aa7-72be92403208.sh: line 5: aws: command not found 2025-12-04T11:11:28.0239393Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:28.0351281Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:28.0363446Z + sleep 2 2025-12-04T11:11:30.0379146Z + login 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:30.0381474Z + aws ecr get-login-password --region us-east-1 2025-12-04T11:11:30.0381943Z /home/runner/_work/_temp/01919194-d03e-4bd1-9aa7-72be92403208.sh: line 5: aws: command not found 2025-12-04T11:11:30.0383570Z + docker login -u AWS --password-stdin 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T11:11:30.0491324Z Error: Cannot perform an interactive login from a non TTY device 2025-12-04T11:11:30.0511618Z ++ docker manifest inspect 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:30.0512369Z ++ jq '[.layers[].size, .config.size] | add / 1024 / 1024' 2025-12-04T11:11:31.3943208Z + IMAGE_SIZE=18579.916069984436 2025-12-04T11:11:31.3943484Z + echo 'Compressed size of image in MB: 18579.916069984436' 2025-12-04T11:11:31.3943668Z + set -e 2025-12-04T11:11:31.3944049Z + docker inspect --type=image 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:11:31.3944392Z Compressed size of image in MB: 18579.916069984436 2025-12-04T11:11:31.4112805Z Prepare all required actions 2025-12-04T11:11:31.4128077Z ##[group]Run ./.github/actions/get-workflow-job-id 2025-12-04T11:11:31.4128284Z with: 2025-12-04T11:11:31.4128591Z github-token: *** 2025-12-04T11:11:31.4128690Z env: 2025-12-04T11:11:31.4128784Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:31.4128924Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:31.4129109Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:31.4129278Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:31.4129779Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:31.4130273Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:31.4130404Z AWS_REGION: us-east-1 2025-12-04T11:11:31.4130583Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:31.4130737Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:31.4132761Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:31.4132867Z ##[endgroup] 2025-12-04T11:11:31.4139962Z ##[group]Run set -eux 2025-12-04T11:11:31.4140084Z set -eux 2025-12-04T11:11:31.4140256Z python3 .github/scripts/get_workflow_job_id.py "${GITHUB_RUN_ID}" "${RUNNER_NAME}" 2025-12-04T11:11:31.4144777Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:11:31.4144922Z env: 2025-12-04T11:11:31.4145019Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:31.4145158Z 
RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:31.4145336Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:31.4145627Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:31.4146130Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:31.4146619Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:31.4146737Z AWS_REGION: us-east-1 2025-12-04T11:11:31.4146891Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:31.4147054Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:31.4149097Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:31.4149267Z GITHUB_TOKEN: *** 2025-12-04T11:11:31.4149365Z ##[endgroup] 2025-12-04T11:11:31.4168533Z + python3 .github/scripts/get_workflow_job_id.py 19922798714 linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv 2025-12-04T11:11:32.1324740Z Setting output job-id=57117547540 2025-12-04T11:11:32.1325163Z Setting output job-name=linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:11:32.1438009Z Prepare all required actions 2025-12-04T11:11:32.1438282Z Getting action download info 2025-12-04T11:11:32.3686402Z Download action repository 'seemethere/download-artifact-s3@v4' (SHA:1da556a7aa0a088e3153970611f6c432d58e80e6) 2025-12-04T11:11:33.2265647Z Download action repository 'actions/download-artifact@v4' (SHA:d3f86a106a0bac45b974a628896c90dbdf5c8093) 2025-12-04T11:11:34.0710761Z ##[group]Run ./.github/actions/download-build-artifacts 2025-12-04T11:11:34.0710927Z with: 2025-12-04T11:11:34.0711037Z name: linux-noble-rocm-py3.12-mi300 2025-12-04T11:11:34.0711168Z s3-bucket: gha-artifacts 2025-12-04T11:11:34.0711279Z env: 2025-12-04T11:11:34.0711377Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:34.0711513Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:34.0711707Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:34.0711872Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:11:34.0712399Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:34.0712892Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:34.0713006Z AWS_REGION: us-east-1 2025-12-04T11:11:34.0713185Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:34.0713334Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:34.0715350Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:34.0715454Z ##[endgroup] 2025-12-04T11:11:34.0730396Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T11:11:34.0730530Z with: 2025-12-04T11:11:34.0730634Z name: linux-noble-rocm-py3.12-mi300 2025-12-04T11:11:34.0730763Z s3-bucket: gha-artifacts 2025-12-04T11:11:34.0730879Z region: us-east-1 2025-12-04T11:11:34.0730974Z env: 2025-12-04T11:11:34.0731065Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:11:34.0731205Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:11:34.0731387Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:11:34.0731558Z RUNNER_DOCS_DIR: 
/home/runner/_work/_temp/docs 2025-12-04T11:11:34.0732069Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:11:34.0732565Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:11:34.0732687Z AWS_REGION: us-east-1 2025-12-04T11:11:34.0732822Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:11:34.0732971Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:11:34.0735100Z AWS_SESSION_TOKEN: *** 2025-12-04T11:11:34.0735199Z ##[endgroup] 2025-12-04T11:11:34.3064318Z (node:20336) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T11:11:34.3064548Z 2025-12-04T11:11:34.3065538Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-12-04T11:11:34.3066081Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T11:11:34.3066530Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T11:11:34.5823670Z Found 1 objects with prefix pytorch/pytorch/19922798714/linux-noble-rocm-py3.12-mi300/ 2025-12-04T11:11:34.5824386Z Starting download (1/1): /home/runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T11:12:10.1981185Z Finished download (1/1): /home/runner/_work/pytorch/pytorch/artifacts.zip 2025-12-04T11:12:10.1985102Z Artifact download has finished successfully 2025-12-04T11:12:10.2337806Z ##[group]Run unzip -o artifacts.zip 2025-12-04T11:12:10.2337985Z unzip -o artifacts.zip 2025-12-04T11:12:10.2342803Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:10.2342977Z env: 2025-12-04T11:12:10.2343294Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:10.2343452Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:10.2343659Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:10.2343854Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:10.2344451Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:10.2345030Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:10.2345166Z AWS_REGION: us-east-1 2025-12-04T11:12:10.2345345Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:10.2345522Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:10.2347880Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:10.2347989Z ##[endgroup] 2025-12-04T11:12:10.2384312Z Archive: artifacts.zip 2025-12-04T11:12:10.2385732Z creating: dist/ 2025-12-04T11:12:13.1724833Z inflating: dist/torch-2.10.0a0+gitffd9b0f-cp312-cp312-linux_x86_64.whl 2025-12-04T11:12:13.1804110Z inflating: dist/.ninja_log 2025-12-04T11:12:13.1804409Z creating: build/custom_test_artifacts/ 2025-12-04T11:12:13.1808879Z creating: build/custom_test_artifacts/custom-op-build/ 2025-12-04T11:12:13.1809399Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/ 2025-12-04T11:12:13.1809964Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/pkgRedirects/ 2025-12-04T11:12:13.1810577Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T11:12:13.1811171Z creating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/ 2025-12-04T11:12:13.1811761Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T11:12:13.1812434Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T11:12:13.1813042Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T11:12:13.1813745Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T11:12:13.1814446Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T11:12:13.1815103Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T11:12:13.1815739Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T11:12:13.1816354Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T11:12:13.1816906Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T11:12:13.1818139Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T11:12:13.1818678Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T11:12:13.1819211Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T11:12:13.1819778Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T11:12:13.1820265Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeScratch/ 2025-12-04T11:12:13.1820657Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeTmp/ 2025-12-04T11:12:13.1821069Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/cmake.check_cache 2025-12-04T11:12:13.1821604Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/ 2025-12-04T11:12:13.1822085Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.ts 2025-12-04T11:12:13.1822840Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/compiler_depend.make 2025-12-04T11:12:13.1823347Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/depend.make 2025-12-04T11:12:13.1823826Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.txt 2025-12-04T11:12:13.1824309Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/cmake_clean.cmake 2025-12-04T11:12:13.1824803Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/build.make 2025-12-04T11:12:13.1825299Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/DependInfo.cmake 2025-12-04T11:12:13.1825784Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/flags.make 2025-12-04T11:12:13.1826283Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/progress.make 2025-12-04T11:12:13.1830746Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o.d 2025-12-04T11:12:13.1947134Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/op.cpp.o 2025-12-04T11:12:13.1947495Z inflating: 
build/custom_test_artifacts/custom-op-build/CMakeFiles/custom_ops.dir/link.d 2025-12-04T11:12:13.1947920Z creating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/ 2025-12-04T11:12:13.1948335Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.ts 2025-12-04T11:12:13.1948745Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/compiler_depend.make 2025-12-04T11:12:13.1949123Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/depend.make 2025-12-04T11:12:13.1949485Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.txt 2025-12-04T11:12:13.1949863Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/cmake_clean.cmake 2025-12-04T11:12:13.1950240Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/build.make 2025-12-04T11:12:13.1950605Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/DependInfo.cmake 2025-12-04T11:12:13.1950975Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/flags.make 2025-12-04T11:12:13.1951337Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/progress.make 2025-12-04T11:12:13.1962281Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o.d 2025-12-04T11:12:13.2009807Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/test_custom_ops.cpp.o 2025-12-04T11:12:13.2010257Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/test_custom_ops.dir/link.d 2025-12-04T11:12:13.2010603Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T11:12:13.2010927Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/TargetDirectories.txt 2025-12-04T11:12:13.2011227Z extracting: build/custom_test_artifacts/custom-op-build/CMakeFiles/progress.marks 2025-12-04T11:12:13.2011502Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile2 2025-12-04T11:12:13.2011861Z inflating: build/custom_test_artifacts/custom-op-build/CMakeFiles/Makefile.cmake 2025-12-04T11:12:13.2012356Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_outer_vec.cc 2025-12-04T11:12:13.2012804Z inflating: build/custom_test_artifacts/custom-op-build/hipblaslt_test_vec_ext.cc 2025-12-04T11:12:13.2013349Z inflating: build/custom_test_artifacts/custom-op-build/CMakeCache.txt 2025-12-04T11:12:13.2013964Z inflating: build/custom_test_artifacts/custom-op-build/Makefile 2025-12-04T11:12:13.2014303Z inflating: build/custom_test_artifacts/custom-op-build/cmake_install.cmake 2025-12-04T11:12:13.2115622Z inflating: build/custom_test_artifacts/custom-op-build/libcustom_ops.so 2025-12-04T11:12:13.2149523Z inflating: build/custom_test_artifacts/custom-op-build/test_custom_ops 2025-12-04T11:12:13.2149830Z creating: build/custom_test_artifacts/jit-hook-build/ 2025-12-04T11:12:13.2150123Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/ 2025-12-04T11:12:13.2150437Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/pkgRedirects/ 2025-12-04T11:12:13.2152380Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T11:12:13.2152735Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/ 
2025-12-04T11:12:13.2153082Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T11:12:13.2153461Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T11:12:13.2153820Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T11:12:13.2154433Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T11:12:13.2155154Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T11:12:13.2155552Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T11:12:13.2155925Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T11:12:13.2156288Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T11:12:13.2157327Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T11:12:13.2157919Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T11:12:13.2158409Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T11:12:13.2159443Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T11:12:13.2160071Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T11:12:13.2160439Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeScratch/ 2025-12-04T11:12:13.2160738Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeTmp/ 2025-12-04T11:12:13.2161045Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/cmake.check_cache 2025-12-04T11:12:13.2161372Z creating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/ 2025-12-04T11:12:13.2161828Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.ts 2025-12-04T11:12:13.2162242Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/compiler_depend.make 2025-12-04T11:12:13.2162632Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/depend.make 2025-12-04T11:12:13.2162995Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.txt 2025-12-04T11:12:13.2163374Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/cmake_clean.cmake 2025-12-04T11:12:13.2163755Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/build.make 2025-12-04T11:12:13.2164137Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/DependInfo.cmake 2025-12-04T11:12:13.2164517Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/flags.make 2025-12-04T11:12:13.2164939Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/progress.make 2025-12-04T11:12:13.2175245Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o.d 2025-12-04T11:12:13.2212152Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/test_jit_hooks.cpp.o 2025-12-04T11:12:13.2212528Z inflating: 
build/custom_test_artifacts/jit-hook-build/CMakeFiles/test_jit_hooks.dir/link.d 2025-12-04T11:12:13.2213026Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T11:12:13.2213324Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/TargetDirectories.txt 2025-12-04T11:12:13.2213586Z extracting: build/custom_test_artifacts/jit-hook-build/CMakeFiles/progress.marks 2025-12-04T11:12:13.2213839Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile2 2025-12-04T11:12:13.2214550Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeFiles/Makefile.cmake 2025-12-04T11:12:13.2214805Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_outer_vec.cc 2025-12-04T11:12:13.2215051Z inflating: build/custom_test_artifacts/jit-hook-build/hipblaslt_test_vec_ext.cc 2025-12-04T11:12:13.2215852Z inflating: build/custom_test_artifacts/jit-hook-build/CMakeCache.txt 2025-12-04T11:12:13.2216156Z inflating: build/custom_test_artifacts/jit-hook-build/Makefile 2025-12-04T11:12:13.2216466Z inflating: build/custom_test_artifacts/jit-hook-build/cmake_install.cmake 2025-12-04T11:12:13.2239503Z inflating: build/custom_test_artifacts/jit-hook-build/test_jit_hooks 2025-12-04T11:12:13.2239722Z creating: build/custom_test_artifacts/custom-backend-build/ 2025-12-04T11:12:13.2239942Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/ 2025-12-04T11:12:13.2240194Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/pkgRedirects/ 2025-12-04T11:12:13.2242493Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeConfigureLog.yaml 2025-12-04T11:12:13.2242768Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/ 2025-12-04T11:12:13.2243033Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeSystem.cmake 2025-12-04T11:12:13.2243320Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/ 2025-12-04T11:12:13.2243599Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/tmp/ 2025-12-04T11:12:13.2244571Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/CMakeCCompilerId.c 2025-12-04T11:12:13.2245309Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdC/a.out 2025-12-04T11:12:13.2245700Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCCompiler.cmake 2025-12-04T11:12:13.2246004Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/ 2025-12-04T11:12:13.2246287Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/tmp/ 2025-12-04T11:12:13.2247362Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/CMakeCXXCompilerId.cpp 2025-12-04T11:12:13.2248061Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CompilerIdCXX/a.out 2025-12-04T11:12:13.2248506Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeCXXCompiler.cmake 2025-12-04T11:12:13.2249445Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_C.bin 2025-12-04T11:12:13.2250180Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/3.31.6/CMakeDetermineCompilerABI_CXX.bin 2025-12-04T11:12:13.2250486Z creating: 
build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeScratch/ 2025-12-04T11:12:13.2250794Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeTmp/ 2025-12-04T11:12:13.2251054Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/cmake.check_cache 2025-12-04T11:12:13.2251321Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/ 2025-12-04T11:12:13.2251623Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.ts 2025-12-04T11:12:13.2251963Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/compiler_depend.make 2025-12-04T11:12:13.2252287Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/depend.make 2025-12-04T11:12:13.2252591Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.txt 2025-12-04T11:12:13.2252905Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/cmake_clean.cmake 2025-12-04T11:12:13.2253227Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/build.make 2025-12-04T11:12:13.2253539Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/DependInfo.cmake 2025-12-04T11:12:13.2253850Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/flags.make 2025-12-04T11:12:13.2254161Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/progress.make 2025-12-04T11:12:13.2255330Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o.d 2025-12-04T11:12:13.2325268Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/custom_backend.cpp.o 2025-12-04T11:12:13.2325594Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/custom_backend.dir/link.d 2025-12-04T11:12:13.2325909Z creating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/ 2025-12-04T11:12:13.2326234Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.ts 2025-12-04T11:12:13.2326595Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/compiler_depend.make 2025-12-04T11:12:13.2326941Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/depend.make 2025-12-04T11:12:13.2327269Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.txt 2025-12-04T11:12:13.2327600Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/cmake_clean.cmake 2025-12-04T11:12:13.2327936Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/build.make 2025-12-04T11:12:13.2328361Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/DependInfo.cmake 2025-12-04T11:12:13.2328700Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/flags.make 2025-12-04T11:12:13.2329029Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/progress.make 2025-12-04T11:12:13.2340095Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o.d 
2025-12-04T11:12:13.2372168Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/test_custom_backend.cpp.o 2025-12-04T11:12:13.2372533Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/test_custom_backend.dir/link.d 2025-12-04T11:12:13.2372866Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/CMakeDirectoryInformation.cmake 2025-12-04T11:12:13.2373241Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/TargetDirectories.txt 2025-12-04T11:12:13.2373519Z extracting: build/custom_test_artifacts/custom-backend-build/CMakeFiles/progress.marks 2025-12-04T11:12:13.2373786Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile2 2025-12-04T11:12:13.2374421Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeFiles/Makefile.cmake 2025-12-04T11:12:13.2374701Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_outer_vec.cc 2025-12-04T11:12:13.2374959Z inflating: build/custom_test_artifacts/custom-backend-build/hipblaslt_test_vec_ext.cc 2025-12-04T11:12:13.2375767Z inflating: build/custom_test_artifacts/custom-backend-build/CMakeCache.txt 2025-12-04T11:12:13.2376119Z inflating: build/custom_test_artifacts/custom-backend-build/Makefile 2025-12-04T11:12:13.2376473Z inflating: build/custom_test_artifacts/custom-backend-build/cmake_install.cmake 2025-12-04T11:12:13.2436706Z inflating: build/custom_test_artifacts/custom-backend-build/libcustom_backend.so 2025-12-04T11:12:13.2460315Z inflating: build/custom_test_artifacts/custom-backend-build/test_custom_backend 2025-12-04T11:12:13.2460533Z creating: build/lib/ 2025-12-04T11:12:13.2509838Z inflating: build/lib/libprotobuf-lite.a 2025-12-04T11:12:13.2775216Z inflating: build/lib/libprotobuf.a 2025-12-04T11:12:13.3076483Z inflating: build/lib/libprotoc.a 2025-12-04T11:12:13.3082308Z inflating: build/lib/libpthreadpool.a 2025-12-04T11:12:13.3086582Z inflating: build/lib/libcpuinfo.a 2025-12-04T11:12:13.3091105Z inflating: build/lib/libcpuinfo_internals.a 2025-12-04T11:12:13.3091572Z inflating: build/lib/libclog.a 2025-12-04T11:12:13.3103038Z inflating: build/lib/libpytorch_qnnpack.a 2025-12-04T11:12:13.3104163Z inflating: build/lib/libnnpack_reference_layers.a 2025-12-04T11:12:13.3216897Z inflating: build/lib/libmicrokernels-prod.a 2025-12-04T11:12:13.3227366Z inflating: build/lib/libnnpack.a 2025-12-04T11:12:13.3756875Z inflating: build/lib/libmicrokernels-all.a 2025-12-04T11:12:13.3798127Z inflating: build/lib/libgtest.a 2025-12-04T11:12:13.3808231Z inflating: build/lib/libgmock.a 2025-12-04T11:12:13.3808431Z inflating: build/lib/libgtest_main.a 2025-12-04T11:12:13.3808607Z inflating: build/lib/libgmock_main.a 2025-12-04T11:12:13.3863233Z inflating: build/lib/libXNNPACK.a 2025-12-04T11:12:13.3908695Z inflating: build/lib/libbenchmark.a 2025-12-04T11:12:13.3908915Z inflating: build/lib/libbenchmark_main.a 2025-12-04T11:12:13.3948903Z inflating: build/lib/libasmjit.a 2025-12-04T11:12:13.3949116Z inflating: build/lib/libjitprofiling.a 2025-12-04T11:12:13.3953798Z inflating: build/lib/libittnotify.a 2025-12-04T11:12:13.4646331Z inflating: build/lib/libfbgemm.a 2025-12-04T11:12:13.4664583Z inflating: build/lib/libtensorpipe_uv.a 2025-12-04T11:12:13.4988644Z inflating: build/lib/libtensorpipe.a 2025-12-04T11:12:13.5061173Z inflating: build/lib/libgloo.a 2025-12-04T11:12:13.5089055Z inflating: build/lib/libonnx_proto.a 2025-12-04T11:12:13.5335477Z inflating: build/lib/libgloo_hip.a 
2025-12-04T11:12:13.5761591Z inflating: build/lib/libonnx.a 2025-12-04T11:12:14.1798338Z inflating: build/lib/libdnnl.a 2025-12-04T11:12:14.1809494Z inflating: build/lib/libfmt.a 2025-12-04T11:12:14.1996032Z inflating: build/lib/libkineto.a 2025-12-04T11:12:14.2066558Z inflating: build/lib/libc10.so 2025-12-04T11:12:14.2067083Z inflating: build/lib/libtorch_global_deps.so 2025-12-04T11:12:14.2067863Z inflating: build/lib/libcaffe2_nvrtc.so 2025-12-04T11:12:14.2094999Z inflating: build/lib/libc10_hip.so 2025-12-04T11:12:14.2380117Z inflating: build/lib/libfbgemm_genai.a 2025-12-04T11:12:16.0971281Z inflating: build/lib/libtorch_cpu.so 2025-12-04T11:12:16.0973629Z inflating: build/lib/libshm.so 2025-12-04T11:12:16.9508609Z inflating: build/lib/libtorch_hip.so 2025-12-04T11:12:16.9509086Z inflating: build/lib/libtorch.so 2025-12-04T11:12:16.9520725Z inflating: build/lib/libjitbackend_test.so 2025-12-04T11:12:16.9534755Z inflating: build/lib/libbackend_with_compiler.so 2025-12-04T11:12:16.9577585Z inflating: build/lib/libtorchbind_test.so 2025-12-04T11:12:16.9593429Z inflating: build/lib/libaoti_custom_ops.so 2025-12-04T11:12:17.1040321Z inflating: build/lib/libtorch_python.so 2025-12-04T11:12:17.1062247Z inflating: build/lib/libnnapi_backend.so 2025-12-04T11:12:17.1062444Z creating: build/bin/ 2025-12-04T11:12:17.1062596Z creating: build/bin/CMakeFiles/ 2025-12-04T11:12:17.1062774Z inflating: build/bin/cmake_install.cmake 2025-12-04T11:12:17.1062962Z inflating: build/bin/CTestTestfile.cmake 2025-12-04T11:12:17.1341180Z inflating: build/bin/protoc-3.13.0.0 2025-12-04T11:12:17.1619396Z inflating: build/bin/protoc 2025-12-04T11:12:17.1655390Z inflating: build/bin/c10_AllocatorConfig_test 2025-12-04T11:12:17.1689339Z inflating: build/bin/c10_CompileTimeFunctionPointer_test 2025-12-04T11:12:17.1724022Z inflating: build/bin/c10_DeviceGuard_test 2025-12-04T11:12:17.1758863Z inflating: build/bin/c10_Device_test 2025-12-04T11:12:17.1792121Z inflating: build/bin/c10_StreamGuard_test 2025-12-04T11:12:17.1828774Z inflating: build/bin/c10_Scalar_test 2025-12-04T11:12:17.1868787Z inflating: build/bin/c10_DispatchKeySet_test 2025-12-04T11:12:17.1905291Z inflating: build/bin/c10_InlineDeviceGuard_test 2025-12-04T11:12:17.1943515Z inflating: build/bin/c10_SymInt_test 2025-12-04T11:12:17.1982068Z inflating: build/bin/c10_InlineStreamGuard_test 2025-12-04T11:12:17.2019275Z inflating: build/bin/c10_SizesAndStrides_test 2025-12-04T11:12:17.2053088Z inflating: build/bin/c10_ArrayRef_test 2025-12-04T11:12:17.2099750Z inflating: build/bin/c10_cow_test 2025-12-04T11:12:17.2133166Z inflating: build/bin/c10_ConstexprCrc_test 2025-12-04T11:12:17.2166903Z inflating: build/bin/c10_DeadlockDetection_test 2025-12-04T11:12:17.2205278Z inflating: build/bin/c10_Enumerate_test 2025-12-04T11:12:17.2240944Z inflating: build/bin/c10_IntrusiveList_test 2025-12-04T11:12:17.2275533Z inflating: build/bin/c10_Half_test 2025-12-04T11:12:17.2311284Z inflating: build/bin/c10_Bitset_test 2025-12-04T11:12:17.2348987Z inflating: build/bin/c10_LeftRight_test 2025-12-04T11:12:17.2382796Z inflating: build/bin/c10_Semaphore_test 2025-12-04T11:12:17.2418928Z inflating: build/bin/c10_NetworkFlow_test 2025-12-04T11:12:17.2456297Z inflating: build/bin/c10_ThreadLocal_test 2025-12-04T11:12:17.2490522Z inflating: build/bin/c10_Synchronized_test 2025-12-04T11:12:17.2525572Z inflating: build/bin/c10_TypeIndex_test 2025-12-04T11:12:17.2560533Z inflating: build/bin/c10_accumulate_test 2025-12-04T11:12:17.2594175Z inflating: build/bin/c10_error_test 
2025-12-04T11:12:17.2628393Z inflating: build/bin/c10_bit_cast_test 2025-12-04T11:12:17.2666067Z inflating: build/bin/c10_bfloat16_test 2025-12-04T11:12:17.2703375Z inflating: build/bin/c10_complex_test 2025-12-04T11:12:17.2738875Z inflating: build/bin/c10_exception_test 2025-12-04T11:12:17.2776883Z inflating: build/bin/c10_complex_math_test 2025-12-04T11:12:17.2811086Z inflating: build/bin/c10_flags_test 2025-12-04T11:12:17.2845223Z inflating: build/bin/c10_generic_math_test 2025-12-04T11:12:17.2879759Z inflating: build/bin/c10_irange_test 2025-12-04T11:12:17.2979810Z inflating: build/bin/c10_intrusive_ptr_test 2025-12-04T11:12:17.3016047Z inflating: build/bin/c10_lazy_test 2025-12-04T11:12:17.3054534Z inflating: build/bin/c10_logging_test 2025-12-04T11:12:17.3088447Z inflating: build/bin/c10_nofatal_test 2025-12-04T11:12:17.3138060Z inflating: build/bin/c10_optional_test 2025-12-04T11:12:17.3174039Z inflating: build/bin/c10_registry_test 2025-12-04T11:12:17.3215393Z inflating: build/bin/c10_ordered_preserving_dict_test 2025-12-04T11:12:17.3313195Z inflating: build/bin/c10_small_vector_test 2025-12-04T11:12:17.3348537Z inflating: build/bin/c10_ssize_test 2025-12-04T11:12:17.3386267Z inflating: build/bin/c10_string_util_test 2025-12-04T11:12:17.3419603Z inflating: build/bin/c10_string_view_test 2025-12-04T11:12:17.3449347Z inflating: build/bin/c10_intrusive_ptr_benchmark 2025-12-04T11:12:17.3483700Z inflating: build/bin/c10_tempfile_test 2025-12-04T11:12:17.3521629Z inflating: build/bin/c10_typeid_test 2025-12-04T11:12:17.3554921Z inflating: build/bin/c10_hip_HIPAssertionsTest_1_var_test 2025-12-04T11:12:17.3588253Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_stream 2025-12-04T11:12:17.3621641Z inflating: build/bin/c10_hip_HIPAssertionsTest_catches_thread_and_block_and_device 2025-12-04T11:12:17.3654871Z inflating: build/bin/c10_hip_HIPAssertionsTest_from_2_processes 2025-12-04T11:12:17.3688078Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_blocks_and_threads 2025-12-04T11:12:17.3721685Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_multiple_blocks 2025-12-04T11:12:17.3755207Z inflating: build/bin/c10_hip_HIPAssertionsTest_multiple_writes_from_same_block 2025-12-04T11:12:17.3790734Z inflating: build/bin/c10_hip_HIPTest 2025-12-04T11:12:17.4155459Z inflating: build/bin/vec_test_all_types_DEFAULT 2025-12-04T11:12:17.4529335Z inflating: build/bin/vec_test_all_types_AVX512 2025-12-04T11:12:17.4909341Z inflating: build/bin/vec_test_all_types_AVX2 2025-12-04T11:12:17.4972961Z inflating: build/bin/test_aoti_abi_check 2025-12-04T11:12:17.5006475Z inflating: build/bin/test_vec_half_DEFAULT 2025-12-04T11:12:17.5040399Z inflating: build/bin/test_vec_half_AVX2 2025-12-04T11:12:17.5074273Z inflating: build/bin/test_vec_half_AVX512 2025-12-04T11:12:17.5109720Z inflating: build/bin/BackoffTest 2025-12-04T11:12:17.5145637Z inflating: build/bin/FileStoreTest 2025-12-04T11:12:17.5183759Z inflating: build/bin/TCPStoreTest 2025-12-04T11:12:17.5220337Z inflating: build/bin/HashStoreTest 2025-12-04T11:12:17.5264950Z inflating: build/bin/ProcessGroupGlooTest 2025-12-04T11:12:17.5266680Z inflating: build/bin/example_allreduce 2025-12-04T11:12:17.5268739Z inflating: build/bin/torch_shm_manager 2025-12-04T11:12:17.5305227Z inflating: build/bin/static_runtime_bench 2025-12-04T11:12:17.5464742Z inflating: build/bin/static_runtime_test 2025-12-04T11:12:17.5513404Z inflating: build/bin/Dict_test 2025-12-04T11:12:17.5548505Z inflating: build/bin/Dimname_test 
2025-12-04T11:12:17.5591979Z inflating: build/bin/MaybeOwned_test 2025-12-04T11:12:17.5630336Z inflating: build/bin/NamedTensor_test 2025-12-04T11:12:17.5669873Z inflating: build/bin/apply_utils_test 2025-12-04T11:12:17.5709287Z inflating: build/bin/atest 2025-12-04T11:12:17.5752127Z inflating: build/bin/basic 2025-12-04T11:12:17.5788845Z inflating: build/bin/broadcast_test 2025-12-04T11:12:17.5823215Z inflating: build/bin/cpu_allocator_test 2025-12-04T11:12:17.5862120Z inflating: build/bin/cpu_generator_test 2025-12-04T11:12:17.5897826Z inflating: build/bin/cpu_profiling_allocator_test 2025-12-04T11:12:17.5958594Z inflating: build/bin/cpu_rng_test 2025-12-04T11:12:17.5993556Z inflating: build/bin/dlconvertor_test 2025-12-04T11:12:17.6149113Z inflating: build/bin/extension_backend_test 2025-12-04T11:12:17.6186341Z inflating: build/bin/half_test 2025-12-04T11:12:17.6250432Z inflating: build/bin/ivalue_test 2025-12-04T11:12:17.6284382Z inflating: build/bin/lazy_tensor_test 2025-12-04T11:12:17.6320107Z inflating: build/bin/math_kernel_test 2025-12-04T11:12:17.6356068Z inflating: build/bin/memory_format_test 2025-12-04T11:12:17.6392375Z inflating: build/bin/memory_overlapping_test 2025-12-04T11:12:17.6427002Z inflating: build/bin/operator_name_test 2025-12-04T11:12:17.6463021Z inflating: build/bin/mobile_memory_cleanup 2025-12-04T11:12:17.6500525Z inflating: build/bin/native_test 2025-12-04T11:12:17.6535925Z inflating: build/bin/packedtensoraccessor_test 2025-12-04T11:12:17.6570491Z inflating: build/bin/operators_test 2025-12-04T11:12:17.6619474Z inflating: build/bin/pow_test 2025-12-04T11:12:17.6657553Z inflating: build/bin/quantized_test 2025-12-04T11:12:17.6692155Z inflating: build/bin/reportMemoryUsage_test 2025-12-04T11:12:17.6730047Z inflating: build/bin/reduce_ops_test 2025-12-04T11:12:17.6764842Z inflating: build/bin/StorageUtils_test 2025-12-04T11:12:17.6803402Z inflating: build/bin/scalar_test 2025-12-04T11:12:17.6841279Z inflating: build/bin/scalar_tensor_test 2025-12-04T11:12:17.6879596Z inflating: build/bin/stride_properties_test 2025-12-04T11:12:17.6931950Z inflating: build/bin/tensor_iterator_test 2025-12-04T11:12:17.6969073Z inflating: build/bin/test_parallel 2025-12-04T11:12:17.7006433Z inflating: build/bin/type_ptr_test 2025-12-04T11:12:17.7045070Z inflating: build/bin/thread_init_test 2025-12-04T11:12:17.7080657Z inflating: build/bin/undefined_tensor_test 2025-12-04T11:12:17.7120392Z inflating: build/bin/type_test 2025-12-04T11:12:17.7154001Z inflating: build/bin/verify_api_visibility 2025-12-04T11:12:17.7189097Z inflating: build/bin/weakref_test 2025-12-04T11:12:17.7236617Z inflating: build/bin/legacy_vmap_test 2025-12-04T11:12:17.7271858Z inflating: build/bin/wrapdim_test 2025-12-04T11:12:17.7312172Z inflating: build/bin/IListRef_test 2025-12-04T11:12:17.7346865Z inflating: build/bin/xla_tensor_test 2025-12-04T11:12:17.7415386Z inflating: build/bin/List_test 2025-12-04T11:12:17.7493929Z inflating: build/bin/kernel_function_legacy_test 2025-12-04T11:12:17.7556546Z inflating: build/bin/kernel_function_test 2025-12-04T11:12:17.7600702Z inflating: build/bin/KernelFunction_test 2025-12-04T11:12:17.7682061Z inflating: build/bin/kernel_lambda_legacy_test 2025-12-04T11:12:17.7748408Z inflating: build/bin/kernel_lambda_test 2025-12-04T11:12:17.7811111Z inflating: build/bin/make_boxed_from_unboxed_functor_test 2025-12-04T11:12:17.7851372Z inflating: build/bin/kernel_stackbased_test 2025-12-04T11:12:17.7886010Z inflating: build/bin/CppSignature_test 2025-12-04T11:12:17.7919500Z 
inflating: build/bin/op_allowlist_test 2025-12-04T11:12:17.8113881Z inflating: build/bin/op_registration_test 2025-12-04T11:12:17.8147053Z inflating: build/bin/hip_complex_math_test 2025-12-04T11:12:17.8191634Z inflating: build/bin/inline_container_test 2025-12-04T11:12:17.8228745Z inflating: build/bin/backend_fallback_test 2025-12-04T11:12:17.8264650Z inflating: build/bin/hip_apply_test 2025-12-04T11:12:17.8298198Z inflating: build/bin/hip_complex_test 2025-12-04T11:12:17.8331496Z inflating: build/bin/hip_distributions_test 2025-12-04T11:12:17.8364704Z inflating: build/bin/hip_generator_test 2025-12-04T11:12:17.8397864Z inflating: build/bin/hip_half_test 2025-12-04T11:12:17.8431080Z inflating: build/bin/hip_integer_divider_test 2025-12-04T11:12:17.8464312Z inflating: build/bin/hip_optional_test 2025-12-04T11:12:17.8497667Z inflating: build/bin/hip_packedtensoraccessor_test 2025-12-04T11:12:17.8533044Z inflating: build/bin/hip_dlconvertor_test 2025-12-04T11:12:17.8566212Z inflating: build/bin/hip_vectorized_test 2025-12-04T11:12:17.9317903Z inflating: build/bin/test_jit 2025-12-04T11:12:17.9537909Z inflating: build/bin/test_lazy 2025-12-04T11:12:17.9575144Z inflating: build/bin/test_dist_autograd 2025-12-04T11:12:17.9620861Z inflating: build/bin/test_cpp_rpc 2025-12-04T11:12:17.9622349Z inflating: build/bin/parallel_benchmark 2025-12-04T11:12:18.0353761Z inflating: build/bin/test_api 2025-12-04T11:12:18.0354166Z creating: .additional_ci_files/ 2025-12-04T11:12:18.0392859Z inflating: .additional_ci_files/test-times.json 2025-12-04T11:12:18.0536253Z inflating: .additional_ci_files/test-class-times.json 2025-12-04T11:12:18.0562532Z ##[group]Run rm artifacts.zip 2025-12-04T11:12:18.0562722Z rm artifacts.zip 2025-12-04T11:12:18.0567958Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:18.0568125Z env: 2025-12-04T11:12:18.0568285Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:18.0568433Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:18.0568625Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:18.0568808Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:18.0569346Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:18.0569875Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:18.0570001Z AWS_REGION: us-east-1 2025-12-04T11:12:18.0570196Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:18.0570366Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:18.0572521Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:18.0572642Z ##[endgroup] 2025-12-04T11:12:18.1699888Z ##[group]Run df -H 2025-12-04T11:12:18.1700087Z df -H 2025-12-04T11:12:18.1705881Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:18.1706064Z env: 2025-12-04T11:12:18.1706183Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:18.1706355Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:18.1706572Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:18.1706774Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:18.1707361Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add 
video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:18.1707962Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:18.1708115Z AWS_REGION: us-east-1 2025-12-04T11:12:18.1708565Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:18.1708746Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:18.1711094Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:18.1711230Z ##[endgroup] 2025-12-04T11:12:18.2430017Z Filesystem Size Used Avail Use% Mounted on 2025-12-04T11:12:18.2430236Z overlay 16T 544G 15T 4% / 2025-12-04T11:12:18.2430381Z tmpfs 68M 0 68M 0% /dev 2025-12-04T11:12:18.2430524Z /dev/md0 16T 544G 15T 4% /run 2025-12-04T11:12:18.2430667Z shm 68M 17k 68M 1% /dev/shm 2025-12-04T11:12:18.2430839Z amdprj2-k8s_2 5.5T 120G 5.4T 3% /home/runner/pytorch-data 2025-12-04T11:12:18.2431036Z tmpfs 3.3T 13k 3.3T 1% /run/secrets/kubernetes.io/serviceaccount 2025-12-04T11:12:18.2431204Z tmpfs 1.7T 0 1.7T 0% /proc/acpi 2025-12-04T11:12:18.2431799Z tmpfs 1.7T 0 1.7T 0% /proc/scsi 2025-12-04T11:12:18.2432054Z tmpfs 1.7T 0 1.7T 0% /sys/firmware 2025-12-04T11:12:18.2432265Z tmpfs 1.7T 0 1.7T 0% /sys/devices/virtual/powercap 2025-12-04T11:12:18.2462600Z Prepare all required actions 2025-12-04T11:12:18.2462853Z Getting action download info 2025-12-04T11:12:18.4637760Z ##[group]Run ./.github/actions/download-td-artifacts 2025-12-04T11:12:18.4637902Z with: 2025-12-04T11:12:18.4637994Z env: 2025-12-04T11:12:18.4638089Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:18.4638296Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:18.4638470Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:18.4638636Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:18.4639141Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:18.4639640Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:18.4639772Z AWS_REGION: us-east-1 2025-12-04T11:12:18.4639950Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:18.4640097Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:18.4642103Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:18.4642205Z ##[endgroup] 2025-12-04T11:12:18.4654937Z ##[group]Run seemethere/download-artifact-s3@v4 2025-12-04T11:12:18.4655076Z with: 2025-12-04T11:12:18.4655174Z name: td_results 2025-12-04T11:12:18.4655276Z s3-bucket: gha-artifacts 2025-12-04T11:12:18.4655384Z region: us-east-1 2025-12-04T11:12:18.4655481Z env: 2025-12-04T11:12:18.4655572Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:18.4655710Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:18.4655886Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:18.4656057Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:18.4656561Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:18.4657055Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:18.4657172Z AWS_REGION: us-east-1 2025-12-04T11:12:18.4657303Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:18.4657451Z 
AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:18.4659495Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:18.4659604Z ##[endgroup] 2025-12-04T11:12:18.6977317Z (node:20367) NOTE: We are formalizing our plans to enter AWS SDK for JavaScript (v2) into maintenance mode in 2023. 2025-12-04T11:12:18.6978245Z 2025-12-04T11:12:18.6979146Z Please migrate your code to use AWS SDK for JavaScript (v3). 2025-12-04T11:12:18.6979887Z For more information, check the migration guide at https://a.co/7PzMCcy 2025-12-04T11:12:18.6980479Z (Use `node --trace-warnings ...` to show where the warning was created) 2025-12-04T11:12:18.9725362Z Found 1 objects with prefix pytorch/pytorch/19922798714/td_results/ 2025-12-04T11:12:18.9725755Z Starting download (1/1): /home/runner/_work/pytorch/pytorch/td_results.json 2025-12-04T11:12:19.4355662Z Finished download (1/1): /home/runner/_work/pytorch/pytorch/td_results.json 2025-12-04T11:12:19.4359652Z Artifact download has finished successfully 2025-12-04T11:12:19.4556195Z ##[group]Run mkdir -p .additional_ci_files 2025-12-04T11:12:19.4556414Z mkdir -p .additional_ci_files 2025-12-04T11:12:19.4556626Z mv td_results.json .additional_ci_files/td_results.json || true 2025-12-04T11:12:19.4561546Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:19.4561734Z env: 2025-12-04T11:12:19.4561852Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:19.4562181Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:19.4562402Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:19.4562615Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:19.4563385Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:19.4563890Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:19.4564012Z AWS_REGION: us-east-1 2025-12-04T11:12:19.4564286Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:19.4564448Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:19.4566459Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:19.4566571Z ##[endgroup] 2025-12-04T11:12:19.4640822Z ##[group]Run .github/scripts/parse_ref.py 2025-12-04T11:12:19.4641055Z .github/scripts/parse_ref.py 2025-12-04T11:12:19.4645743Z shell: /usr/bin/bash -e {0} 2025-12-04T11:12:19.4645864Z env: 2025-12-04T11:12:19.4645971Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:19.4646116Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:19.4646305Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:19.4646481Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:19.4647000Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:19.4647505Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:19.4647630Z AWS_REGION: us-east-1 2025-12-04T11:12:19.4647836Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:19.4647995Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:19.4650219Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:19.4650336Z ##[endgroup] 2025-12-04T11:12:19.4760600Z Setting output branch=main 2025-12-04T11:12:19.4831662Z 
Prepare all required actions 2025-12-04T11:12:19.4831880Z Getting action download info 2025-12-04T11:12:19.6742867Z ##[group]Run ./.github/actions/filter-test-configs 2025-12-04T11:12:19.6743037Z with: 2025-12-04T11:12:19.6743286Z github-token: *** 2025-12-04T11:12:19.6744643Z test-matrix: {"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}]} 2025-12-04T11:12:19.6746211Z job-name: linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:19.6746471Z env: 2025-12-04T11:12:19.6746579Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:19.6746730Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:19.6746923Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:19.6747245Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:19.6747767Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:19.6748337Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:19.6748466Z AWS_REGION: us-east-1 2025-12-04T11:12:19.6748607Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:19.6748769Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:19.6750806Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:19.6750922Z ##[endgroup] 2025-12-04T11:12:19.6769039Z ##[group]Run nick-fields/retry@v3.0.0 2025-12-04T11:12:19.6769177Z with: 2025-12-04T11:12:19.6769270Z shell: bash 2025-12-04T11:12:19.6769367Z timeout_minutes: 10 2025-12-04T11:12:19.6769472Z max_attempts: 5 2025-12-04T11:12:19.6769574Z retry_wait_seconds: 30 2025-12-04T11:12:19.6769889Z command: set -eux # PyYAML 6.0 doesn't work with MacOS x86 anymore # This must run on Python-3.7 (AmazonLinux2) so can't use request=3.32.2 python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T11:12:19.6770190Z polling_interval_seconds: 1 2025-12-04T11:12:19.6770304Z warning_on_retry: true 2025-12-04T11:12:19.6770411Z continue_on_error: false 2025-12-04T11:12:19.6770519Z env: 2025-12-04T11:12:19.6770610Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:19.6770743Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:19.6770925Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 
2025-12-04T11:12:19.6771123Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:19.6771640Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:19.6772311Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:19.6772433Z AWS_REGION: us-east-1 2025-12-04T11:12:19.6772583Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:19.6772739Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:19.6774770Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:19.6774954Z GITHUB_TOKEN: *** 2025-12-04T11:12:19.6775061Z ##[endgroup] 2025-12-04T11:12:19.7161877Z + python3 -m pip install requests==2.27.1 pyyaml==6.0.2 2025-12-04T11:12:19.8575120Z Defaulting to user installation because normal site-packages is not writeable 2025-12-04T11:12:20.0136127Z Collecting requests==2.27.1 2025-12-04T11:12:20.0468730Z Downloading requests-2.27.1-py2.py3-none-any.whl (63 kB) 2025-12-04T11:12:20.0775047Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.1/63.1 KB 1.8 MB/s eta 0:00:00 2025-12-04T11:12:20.1273888Z Collecting pyyaml==6.0.2 2025-12-04T11:12:20.1329331Z Downloading PyYAML-6.0.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (751 kB) 2025-12-04T11:12:20.1875506Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 751.2/751.2 KB 14.2 MB/s eta 0:00:00 2025-12-04T11:12:20.2083792Z Collecting idna<4,>=2.5 2025-12-04T11:12:20.2139671Z Downloading idna-3.11-py3-none-any.whl (71 kB) 2025-12-04T11:12:20.2168519Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 71.0/71.0 KB 33.2 MB/s eta 0:00:00 2025-12-04T11:12:20.2357145Z Collecting certifi>=2017.4.17 2025-12-04T11:12:20.2411958Z Downloading certifi-2025.11.12-py3-none-any.whl (159 kB) 2025-12-04T11:12:20.2476360Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 159.4/159.4 KB 28.0 MB/s eta 0:00:00 2025-12-04T11:12:20.2759652Z Collecting urllib3<1.27,>=1.21.1 2025-12-04T11:12:20.2814039Z Downloading urllib3-1.26.20-py2.py3-none-any.whl (144 kB) 2025-12-04T11:12:20.2873357Z ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 144.2/144.2 KB 27.9 MB/s eta 0:00:00 2025-12-04T11:12:20.3775137Z Collecting charset-normalizer~=2.0.0 2025-12-04T11:12:20.3831605Z Downloading charset_normalizer-2.0.12-py3-none-any.whl (39 kB) 2025-12-04T11:12:20.4361432Z Installing collected packages: urllib3, pyyaml, idna, charset-normalizer, certifi, requests 2025-12-04T11:12:20.5286723Z WARNING: The script normalizer is installed in '/home/runner/.local/bin' which is not on PATH. 2025-12-04T11:12:20.5287058Z Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location. 2025-12-04T11:12:20.5457740Z Successfully installed certifi-2025.11.12 charset-normalizer-2.0.12 idna-3.11 pyyaml-6.0.2 requests-2.27.1 urllib3-1.26.20 2025-12-04T11:12:20.7163995Z Command completed after 1 attempt(s). 
2025-12-04T11:12:20.7214829Z ##[group]Run set -x 2025-12-04T11:12:20.7215062Z set -x 2025-12-04T11:12:20.7215217Z  2025-12-04T11:12:20.7215473Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T11:12:20.7215778Z # in runner workspace 2025-12-04T11:12:20.7216035Z python3 "${GITHUB_ACTION_PATH}/../../scripts/parse_ref.py" 2025-12-04T11:12:20.7222200Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:20.7222400Z env: 2025-12-04T11:12:20.7222526Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:20.7222702Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:20.7222920Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:20.7223132Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:20.7223781Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:20.7224414Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:20.7224564Z AWS_REGION: us-east-1 2025-12-04T11:12:20.7224814Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:20.7225018Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:20.7227538Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:20.7227846Z ##[endgroup] 2025-12-04T11:12:20.7244536Z + python3 /home/runner/_work/pytorch/pytorch/./.github/actions/filter-test-configs/../../scripts/parse_ref.py 2025-12-04T11:12:20.7321876Z Setting output branch=main 2025-12-04T11:12:20.7352738Z ##[group]Run echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T11:12:20.7352912Z echo "Workflow: ${GITHUB_WORKFLOW}" 2025-12-04T11:12:20.7353047Z echo "Job name: ${JOB_NAME}" 2025-12-04T11:12:20.7353166Z  2025-12-04T11:12:20.7353315Z # Use relative path here as this could be checked out anywhere, not necessarily 2025-12-04T11:12:20.7353497Z # in runner workspace 2025-12-04T11:12:20.7353663Z python3 "${GITHUB_ACTION_PATH}/../../scripts/filter_test_configs.py" \ 2025-12-04T11:12:20.7353847Z  --workflow "${GITHUB_WORKFLOW}" \ 2025-12-04T11:12:20.7353976Z  --job-name "${JOB_NAME}" \ 2025-12-04T11:12:20.7355309Z  --test-matrix "{"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}]}" \ 2025-12-04T11:12:20.7356792Z  --selected-test-configs "" \ 2025-12-04T11:12:20.7356924Z  --pr-number "${PR_NUMBER}" \ 
2025-12-04T11:12:20.7357048Z  --tag "${TAG}" \ 2025-12-04T11:12:20.7357166Z  --event-name "${EVENT_NAME}" \ 2025-12-04T11:12:20.7357289Z  --schedule "${SCHEDULE}" \ 2025-12-04T11:12:20.7357412Z  --branch "${HEAD_BRANCH}" 2025-12-04T11:12:20.7361958Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:20.7362107Z env: 2025-12-04T11:12:20.7362200Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:20.7362333Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:20.7362509Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:20.7362674Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:20.7363185Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:20.7363672Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:20.7363788Z AWS_REGION: us-east-1 2025-12-04T11:12:20.7363950Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:20.7364098Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:20.7366089Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:20.7366276Z GITHUB_TOKEN: *** 2025-12-04T11:12:20.7366511Z JOB_NAME: linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:20.7366755Z PR_NUMBER: 2025-12-04T11:12:20.7366844Z TAG: 2025-12-04T11:12:20.7366931Z EVENT_NAME: schedule 2025-12-04T11:12:20.7367032Z SCHEDULE: 29 8 * * * 2025-12-04T11:12:20.7367134Z HEAD_BRANCH: main 2025-12-04T11:12:20.7367232Z ##[endgroup] 2025-12-04T11:12:20.7381749Z Workflow: periodic-rocm-mi300 2025-12-04T11:12:20.7382014Z Job name: linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:21.2969611Z Setting output keep-going=True 2025-12-04T11:12:21.2970091Z Setting output ci-verbose-test-logs=False 2025-12-04T11:12:21.2970486Z Setting output ci-test-showlocals=False 2025-12-04T11:12:21.2970867Z Setting output ci-no-test-timeout=False 2025-12-04T11:12:21.2971219Z Setting output ci-no-td=False 2025-12-04T11:12:21.2971556Z Setting output ci-td-distributed=False 2025-12-04T11:12:21.2971914Z Setting output is-unstable=False 2025-12-04T11:12:21.2972247Z Setting output reenabled-issues= 2025-12-04T11:12:21.2978462Z Setting output test-matrix={"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], 
"mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}]} 2025-12-04T11:12:21.2983839Z Setting output is-test-matrix-empty=False 2025-12-04T11:12:21.3060548Z ##[group]Run echo "Filtered matrix:" 2025-12-04T11:12:21.3060798Z echo "Filtered matrix:" 2025-12-04T11:12:21.3064425Z echo "{"include": [{"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 1, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 2, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], 
"mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "mem_leak_check": "mem_leak_check", "rerun_disabled_tests": "rerun_disabled_tests"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests", "mem_leak_check": "mem_leak_check"}, {"config": "distributed", "shard": 3, "num_shards": 3, "runner": "linux.rocm.gpu.gfx942.4.b", "owners": ["module:rocm", "oncall:distributed"], "rerun_disabled_tests": "rerun_disabled_tests"}]}" 2025-12-04T11:12:21.3067787Z  2025-12-04T11:12:21.3067895Z echo 2025-12-04T11:12:21.3068041Z echo "Is the current job unstable? False" 2025-12-04T11:12:21.3068243Z  2025-12-04T11:12:21.3068343Z echo 2025-12-04T11:12:21.3068472Z echo "Is keep-going label set? True" 2025-12-04T11:12:21.3068623Z  2025-12-04T11:12:21.3068725Z echo 2025-12-04T11:12:21.3068842Z echo "Reenabled issues? " 2025-12-04T11:12:21.3073337Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:21.3073492Z env: 2025-12-04T11:12:21.3073593Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:21.3073735Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:21.3073916Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:21.3074088Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:21.3074602Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:21.3075098Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:21.3075220Z AWS_REGION: us-east-1 2025-12-04T11:12:21.3075406Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:21.3075624Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:21.3077795Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:21.3077910Z ##[endgroup] 2025-12-04T11:12:21.3103066Z Filtered matrix: 2025-12-04T11:12:21.3106118Z {include: [{config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: distributed, shard: 1, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: 
distributed, shard: 2, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], mem_leak_check: mem_leak_check, rerun_disabled_tests: rerun_disabled_tests}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests, mem_leak_check: mem_leak_check}, {config: distributed, shard: 3, num_shards: 3, runner: linux.rocm.gpu.gfx942.4.b, owners: [module:rocm, oncall:distributed], rerun_disabled_tests: rerun_disabled_tests}]} 2025-12-04T11:12:21.3109001Z 2025-12-04T11:12:21.3109055Z Is the current job unstable? False 2025-12-04T11:12:21.3109143Z 2025-12-04T11:12:21.3109197Z Is keep-going label set? True 2025-12-04T11:12:21.3109424Z 2025-12-04T11:12:21.3109467Z Reenabled issues? 2025-12-04T11:12:21.3140521Z ##[group]Run echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T11:12:21.3140915Z echo "timeout=$((JOB_TIMEOUT-30))" >> "${GITHUB_OUTPUT}" 2025-12-04T11:12:21.3144964Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:21.3145192Z env: 2025-12-04T11:12:21.3145337Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:21.3145544Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:21.3145814Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:21.3146068Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:21.3146814Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:21.3147590Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:21.3147776Z AWS_REGION: us-east-1 2025-12-04T11:12:21.3147980Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:21.3148264Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:21.3151267Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:21.3151430Z JOB_TIMEOUT: 600 2025-12-04T11:12:21.3151580Z ##[endgroup] 2025-12-04T11:12:21.3184601Z ##[group]Run env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:12:21.3184827Z env | grep '^GITHUB' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:12:21.3185024Z env | grep '^CI' >> "/tmp/github_env_${GITHUB_RUN_ID}" 2025-12-04T11:12:21.3189695Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T11:12:21.3189848Z env: 2025-12-04T11:12:21.3189945Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:21.3190083Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:21.3190265Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:21.3190446Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:21.3190967Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:21.3191465Z 
AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:21.3191590Z AWS_REGION: us-east-1 2025-12-04T11:12:21.3191762Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:21.3191923Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:21.3193942Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:21.3194053Z ##[endgroup] 2025-12-04T11:12:21.3268402Z ##[group]Run set -x 2025-12-04T11:12:21.3268572Z set -x 2025-12-04T11:12:21.3268675Z  2025-12-04T11:12:21.3268797Z if [[ $TEST_CONFIG == 'multigpu' ]]; then 2025-12-04T11:12:21.3268965Z  TEST_COMMAND=.ci/pytorch/multigpu-test.sh 2025-12-04T11:12:21.3269141Z elif [[ $BUILD_ENVIRONMENT == *onnx* ]]; then 2025-12-04T11:12:21.3269292Z  TEST_COMMAND=.ci/caffe2/test.sh 2025-12-04T11:12:21.3269421Z else 2025-12-04T11:12:21.3269536Z  TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T11:12:21.3269661Z fi 2025-12-04T11:12:21.3269755Z  2025-12-04T11:12:21.3269898Z # detached container should get cleaned up by teardown_ec2_linux 2025-12-04T11:12:21.3270110Z # TODO: Stop building test binaries as part of the build phase 2025-12-04T11:12:21.3270294Z # Used for GPU_FLAG since that doesn't play nice 2025-12-04T11:12:21.3270471Z # shellcheck disable=SC2086,SC2090 2025-12-04T11:12:21.3270612Z container_name=$(docker run \ 2025-12-04T11:12:21.3270743Z  ${GPU_FLAG:-} \ 2025-12-04T11:12:21.3270865Z  -e BUILD_ENVIRONMENT \ 2025-12-04T11:12:21.3270993Z  -e PR_NUMBER \ 2025-12-04T11:12:21.3271240Z  -e GITHUB_ACTIONS \ 2025-12-04T11:12:21.3271365Z  -e GITHUB_REPOSITORY \ 2025-12-04T11:12:21.3271492Z  -e GITHUB_WORKFLOW \ 2025-12-04T11:12:21.3271610Z  -e GITHUB_JOB \ 2025-12-04T11:12:21.3271729Z  -e GITHUB_RUN_ID \ 2025-12-04T11:12:21.3271848Z  -e GITHUB_RUN_NUMBER \ 2025-12-04T11:12:21.3271968Z  -e GITHUB_RUN_ATTEMPT \ 2025-12-04T11:12:21.3272087Z  -e JOB_ID \ 2025-12-04T11:12:21.3272197Z  -e JOB_NAME \ 2025-12-04T11:12:21.3272309Z  -e BASE_SHA \ 2025-12-04T11:12:21.3272416Z  -e BRANCH \ 2025-12-04T11:12:21.3272519Z  -e SHA1 \ 2025-12-04T11:12:21.3272631Z  -e AWS_DEFAULT_REGION \ 2025-12-04T11:12:21.3272750Z  -e IN_WHEEL_TEST \ 2025-12-04T11:12:21.3272864Z  -e SHARD_NUMBER \ 2025-12-04T11:12:21.3272975Z  -e TEST_CONFIG \ 2025-12-04T11:12:21.3273094Z  -e NUM_TEST_SHARDS \ 2025-12-04T11:12:21.3273219Z  -e REENABLED_ISSUES \ 2025-12-04T11:12:21.3273347Z  -e CONTINUE_THROUGH_ERROR \ 2025-12-04T11:12:21.3273480Z  -e VERBOSE_TEST_LOGS \ 2025-12-04T11:12:21.3273608Z  -e TEST_SHOWLOCALS \ 2025-12-04T11:12:21.3273731Z  -e NO_TEST_TIMEOUT \ 2025-12-04T11:12:21.3273850Z  -e NO_TD \ 2025-12-04T11:12:21.3273974Z  -e MAX_JOBS="$(nproc --ignore=2)" \ 2025-12-04T11:12:21.3274127Z  -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK \ 2025-12-04T11:12:21.3274279Z  -e PYTORCH_TEST_RERUN_DISABLED_TESTS \ 2025-12-04T11:12:21.3274423Z  -e TESTS_TO_INCLUDE \ 2025-12-04T11:12:21.3274553Z  -e HUGGING_FACE_HUB_TOKEN \ 2025-12-04T11:12:21.3274689Z  -e DASHBOARD_TAG \ 2025-12-04T11:12:21.3274847Z  --env-file="${RUNNER_TEMP}/github_env_${GITHUB_RUN_ID}" \ 2025-12-04T11:12:21.3275019Z  --ulimit stack=10485760:83886080 \ 2025-12-04T11:12:21.3275155Z  --ulimit core=0 \ 2025-12-04T11:12:21.3275308Z  --env-file="/tmp/github_env_${GITHUB_RUN_ID}" \ 2025-12-04T11:12:21.3275474Z  --security-opt seccomp=unconfined \ 2025-12-04T11:12:21.3275622Z  --cap-add=SYS_PTRACE \ 2025-12-04T11:12:21.3275752Z  --shm-size="8g" \ 2025-12-04T11:12:21.3275872Z  --tty \ 2025-12-04T11:12:21.3275981Z  --detach \ 2025-12-04T11:12:21.3276104Z  --name="${container_name}" \ 2025-12-04T11:12:21.3276239Z  --user jenkins \ 2025-12-04T11:12:21.3276390Z  -v 
"${GITHUB_WORKSPACE}:/var/lib/jenkins/workspace" \ 2025-12-04T11:12:21.3276556Z  -w /var/lib/jenkins/workspace \ 2025-12-04T11:12:21.3276754Z  "${DOCKER_IMAGE}" 2025-12-04T11:12:21.3276871Z ) 2025-12-04T11:12:21.3276987Z # save container name for later step 2025-12-04T11:12:21.3277155Z echo "CONTAINER_NAME=${container_name}" >> "$GITHUB_ENV" 2025-12-04T11:12:21.3277438Z # jenkins user does not have write permission to mounted workspace; work-around by copying within container to jenkins home 2025-12-04T11:12:21.3277794Z docker exec -t "${container_name}" sh -c "cd .. && cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && ${TEST_COMMAND}" 2025-12-04T11:12:21.3282292Z shell: /usr/bin/bash -e {0} 2025-12-04T11:12:21.3282420Z env: 2025-12-04T11:12:21.3282526Z GIT_DEFAULT_BRANCH: main 2025-12-04T11:12:21.3282670Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T11:12:21.3282861Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T11:12:21.3283039Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T11:12:21.3283563Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T11:12:21.3284101Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T11:12:21.3284226Z AWS_REGION: us-east-1 2025-12-04T11:12:21.3284406Z AWS_ACCESS_KEY_ID: *** 2025-12-04T11:12:21.3284564Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T11:12:21.3286577Z AWS_SESSION_TOKEN: *** 2025-12-04T11:12:21.3286716Z BUILD_ENVIRONMENT: linux-noble-rocm-py3.12-mi300 2025-12-04T11:12:21.3286859Z PR_NUMBER: 2025-12-04T11:12:21.3286969Z GITHUB_REPOSITORY: pytorch/pytorch 2025-12-04T11:12:21.3287111Z GITHUB_WORKFLOW: periodic-rocm-mi300 2025-12-04T11:12:21.3287243Z GITHUB_JOB: test 2025-12-04T11:12:21.3287358Z GITHUB_RUN_ID: 19922798714 2025-12-04T11:12:21.3287477Z GITHUB_RUN_NUMBER: 1861 2025-12-04T11:12:21.3287591Z GITHUB_RUN_ATTEMPT: 1 2025-12-04T11:12:21.3287702Z JOB_ID: 57117547540 2025-12-04T11:12:21.3287950Z JOB_NAME: linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:21.3288247Z BRANCH: main 2025-12-04T11:12:21.3288368Z SHA1: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:21.3288534Z BASE_SHA: ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:21.3288680Z TEST_CONFIG: distributed 2025-12-04T11:12:21.3288795Z SHARD_NUMBER: 2 2025-12-04T11:12:21.3288900Z NUM_TEST_SHARDS: 3 2025-12-04T11:12:21.3289008Z REENABLED_ISSUES: 2025-12-04T11:12:21.3289121Z CONTINUE_THROUGH_ERROR: True 2025-12-04T11:12:21.3289244Z VERBOSE_TEST_LOGS: False 2025-12-04T11:12:21.3289363Z TEST_SHOWLOCALS: False 2025-12-04T11:12:21.3289478Z NO_TEST_TIMEOUT: False 2025-12-04T11:12:21.3289587Z NO_TD: False 2025-12-04T11:12:21.3289865Z DOCKER_IMAGE: 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:12:21.3290167Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK: 1 2025-12-04T11:12:21.3290304Z PYTORCH_TEST_RERUN_DISABLED_TESTS: 0 2025-12-04T11:12:21.3290432Z TESTS_TO_INCLUDE: 2025-12-04T11:12:21.3290542Z DASHBOARD_TAG: 2025-12-04T11:12:21.3290694Z HUGGING_FACE_HUB_TOKEN: *** 2025-12-04T11:12:21.3290816Z ##[endgroup] 2025-12-04T11:12:21.3308694Z + [[ distributed == 
\m\u\l\t\i\g\p\u ]] 2025-12-04T11:12:21.3308854Z + [[ linux-noble-rocm-py3.12-mi300 == *onnx* ]] 2025-12-04T11:12:21.3308998Z + TEST_COMMAND=.ci/pytorch/test.sh 2025-12-04T11:12:21.3317639Z +++ nproc --ignore=2 2025-12-04T11:12:21.3329005Z ++ docker run --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host -e BUILD_ENVIRONMENT -e PR_NUMBER -e GITHUB_ACTIONS -e GITHUB_REPOSITORY -e GITHUB_WORKFLOW -e GITHUB_JOB -e GITHUB_RUN_ID -e GITHUB_RUN_NUMBER -e GITHUB_RUN_ATTEMPT -e JOB_ID -e JOB_NAME -e BASE_SHA -e BRANCH -e SHA1 -e AWS_DEFAULT_REGION -e IN_WHEEL_TEST -e SHARD_NUMBER -e TEST_CONFIG -e NUM_TEST_SHARDS -e REENABLED_ISSUES -e CONTINUE_THROUGH_ERROR -e VERBOSE_TEST_LOGS -e TEST_SHOWLOCALS -e NO_TEST_TIMEOUT -e NO_TD -e MAX_JOBS=254 -e PYTORCH_TEST_CUDA_MEM_LEAK_CHECK -e PYTORCH_TEST_RERUN_DISABLED_TESTS -e TESTS_TO_INCLUDE -e HUGGING_FACE_HUB_TOKEN -e DASHBOARD_TAG --env-file=/home/runner/_work/_temp/github_env_19922798714 --ulimit stack=10485760:83886080 --ulimit core=0 --env-file=/tmp/github_env_19922798714 --security-opt seccomp=unconfined --cap-add=SYS_PTRACE --shm-size=8g --tty --detach --name= --user jenkins -v /home/runner/_work/pytorch/pytorch:/var/lib/jenkins/workspace -w /var/lib/jenkins/workspace 308535385114.dkr.ecr.us-east-1.amazonaws.com/pytorch/ci-image:pytorch-linux-noble-rocm-n-py3-f0cd68561080d537ef3d3d6f81b25a6416ad600a 2025-12-04T11:12:21.4772079Z + container_name=5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T11:12:21.4772372Z + echo CONTAINER_NAME=5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T11:12:21.4773524Z + docker exec -t 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d sh -c 'cd .. 
&& cp -R workspace pytorch && cd pytorch && pip install dist/*.whl && .ci/pytorch/test.sh' 2025-12-04T11:12:24.8909942Z Processing ./dist/torch-2.10.0a0+gitffd9b0f-cp312-cp312-linux_x86_64.whl 2025-12-04T11:12:25.4258736Z Requirement already satisfied: filelock in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (3.18.0) 2025-12-04T11:12:25.4259200Z Requirement already satisfied: typing-extensions>=4.10.0 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (4.12.2) 2025-12-04T11:12:25.4260585Z Requirement already satisfied: setuptools in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (78.1.1) 2025-12-04T11:12:25.4263487Z Requirement already satisfied: sympy>=1.13.3 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (1.13.3) 2025-12-04T11:12:25.4264402Z Requirement already satisfied: networkx>=2.5.1 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (2.8.8) 2025-12-04T11:12:25.4265138Z Requirement already satisfied: jinja2 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (3.1.6) 2025-12-04T11:12:25.4266136Z Requirement already satisfied: fsspec>=0.8.5 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from torch==2.10.0a0+gitffd9b0f) (2025.10.0) 2025-12-04T11:12:25.4312820Z Requirement already satisfied: mpmath<1.4,>=1.1.0 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from sympy>=1.13.3->torch==2.10.0a0+gitffd9b0f) (1.3.0) 2025-12-04T11:12:25.4337131Z Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/envs/py_3.12/lib/python3.12/site-packages (from jinja2->torch==2.10.0a0+gitffd9b0f) (3.0.3) 2025-12-04T11:12:25.5682737Z Installing collected packages: torch 2025-12-04T11:12:31.4322131Z Successfully installed torch-2.10.0a0+gitffd9b0f 2025-12-04T11:12:31.4736486Z + export TERM=vt100 2025-12-04T11:12:31.4736683Z + TERM=vt100 2025-12-04T11:12:31.4740935Z ++ dirname .ci/pytorch/test.sh 2025-12-04T11:12:31.4752535Z + source .ci/pytorch/common.sh 2025-12-04T11:12:31.4757580Z +++ dirname .ci/pytorch/common.sh 2025-12-04T11:12:31.4769096Z ++ source .ci/pytorch/common_utils.sh 2025-12-04T11:12:31.4770701Z +++ declare -f -t trap_add 2025-12-04T11:12:31.4776245Z ++ set -ex -o pipefail 2025-12-04T11:12:31.4776468Z ++ [[ linux-noble-rocm-py3.12-mi300 == *rocm* ]] 2025-12-04T11:12:31.4776705Z ++ unset HIP_PLATFORM 2025-12-04T11:12:31.4776888Z ++ export PYTORCH_TEST_WITH_ROCM=1 2025-12-04T11:12:31.4777101Z ++ PYTORCH_TEST_WITH_ROCM=1 2025-12-04T11:12:31.4778946Z ++ BUILD_TEST_LIBTORCH=0 2025-12-04T11:12:31.4785085Z ++ dirname .ci/pytorch/test.sh 2025-12-04T11:12:31.4796241Z + source .ci/pytorch/common-build.sh 2025-12-04T11:12:31.4799698Z ++ [[ linux-noble-rocm-py3.12-mi300 != *win-* ]] 2025-12-04T11:12:31.4810146Z ++++ dirname .ci/pytorch/common-build.sh 2025-12-04T11:12:31.4821686Z +++ cd .ci/pytorch 2025-12-04T11:12:31.4821829Z +++ pwd -P 2025-12-04T11:12:31.4824521Z ++ script_dir=/var/lib/jenkins/pytorch/.ci/pytorch 2025-12-04T11:12:31.4824881Z ++ [[ linux-noble-rocm-py3.12-mi300 == *-pch* ]] 2025-12-04T11:12:31.4825078Z ++ which sccache 2025-12-04T11:12:31.4839410Z ++ [[ -z '' ]] 2025-12-04T11:12:31.4839541Z ++ unset SCCACHE_BUCKET 2025-12-04T11:12:31.4839673Z ++ unset SCCACHE_REGION 2025-12-04T11:12:31.4839797Z ++ sccache --stop-server 2025-12-04T11:12:31.4862540Z ++ true 2025-12-04T11:12:31.4862674Z ++ rm -f /var/lib/jenkins/sccache_error.log 
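For reference, the in-container test step traced above reduces to the shell sequence below; this is a minimal sketch using the paths and wheel name shown in the log (the container name is whatever the earlier docker run printed and was saved as CONTAINER_NAME), not an authoritative reproduction of the CI script.

  # Minimal sketch of the exec'd test step, assuming CONTAINER_NAME is the
  # detached container started above and the wheel/test-script paths from the log.
  docker exec -t "${CONTAINER_NAME}" sh -c '
    cd .. &&
    cp -R workspace pytorch &&      # jenkins user cannot write to the mounted workspace
    cd pytorch &&
    pip install dist/*.whl &&       # install the torch wheel produced by the build phase
    .ci/pytorch/test.sh             # run the shard-specific test driver
  '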
2025-12-04T11:12:31.4871589Z ++ trap_add sccache_epilogue EXIT 2025-12-04T11:12:31.4871738Z ++ trap_add_cmd=sccache_epilogue 2025-12-04T11:12:31.4871884Z ++ shift 2025-12-04T11:12:31.4871997Z ++ for trap_add_name in "$@" 2025-12-04T11:12:31.4880814Z ++++ trap -p EXIT 2025-12-04T11:12:31.4882680Z +++ eval 'extract_trap_cmd ' 2025-12-04T11:12:31.4882814Z ++++ extract_trap_cmd 2025-12-04T11:12:31.4882946Z ++++ printf '%s\n' '' 2025-12-04T11:12:31.4883554Z +++ printf '%s\n' sccache_epilogue 2025-12-04T11:12:31.4885250Z ++ trap -- ' 2025-12-04T11:12:31.4885365Z sccache_epilogue' EXIT 2025-12-04T11:12:31.4885583Z ++ [[ -n '' ]] 2025-12-04T11:12:31.4885717Z ++ [[ linux-noble-rocm-py3.12-mi300 == *rocm* ]] 2025-12-04T11:12:31.4886000Z ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 2025-12-04T11:12:31.4886160Z ++ SCCACHE_IDLE_TIMEOUT=0 2025-12-04T11:12:31.4886282Z ++ sccache --start-server 2025-12-04T11:12:31.4902126Z sccache: Starting the server... 2025-12-04T11:12:31.5237830Z sccache: Listening on address 127.0.0.1:4226 2025-12-04T11:12:31.5249604Z ++ sccache --zero-stats 2025-12-04T11:12:31.5264493Z Statistics zeroed. 2025-12-04T11:12:31.5267882Z ++ which ccache 2025-12-04T11:12:31.5275724Z + [[ linux-noble-rocm-py3.12-mi300 != *rocm* ]] 2025-12-04T11:12:31.5275884Z + [[ linux-noble-rocm-py3.12-mi300 == *cuda* ]] 2025-12-04T11:12:31.5276032Z + echo 'Environment variables:' 2025-12-04T11:12:31.5276160Z Environment variables: 2025-12-04T11:12:31.5276272Z + env 2025-12-04T11:12:31.5285360Z GITHUB_WORKSPACE=/home/runner/_work/pytorch/pytorch 2025-12-04T11:12:31.5285535Z CONTINUE_THROUGH_ERROR=True 2025-12-04T11:12:31.5285676Z BUILD_ENVIRONMENT=linux-noble-rocm-py3.12-mi300 2025-12-04T11:12:31.5285855Z HOSTNAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv 2025-12-04T11:12:31.5286099Z GITHUB_PATH=/home/runner/_work/_temp/_runner_file_commands/add_path_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5286312Z GITHUB_ACTION=__run_2 2025-12-04T11:12:31.5286430Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T11:12:31.5286557Z GITHUB_RUN_NUMBER=1861 2025-12-04T11:12:31.5286666Z TEST_CONFIG=distributed 2025-12-04T11:12:31.5286811Z RUNNER_NAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv 2025-12-04T11:12:31.5286969Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T11:12:31.5287096Z AWS_DEFAULT_REGION=us-east-1 2025-12-04T11:12:31.5287233Z RUNNER_ARTIFACT_DIR=/home/runner/_work/_temp/artifacts 2025-12-04T11:12:31.5287381Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-12-04T11:12:31.5287507Z GITHUB_REF_TYPE=branch 2025-12-04T11:12:31.5287635Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5287908Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T11:12:31.5288308Z *** 2025-12-04T11:12:31.5288416Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T11:12:31.5288536Z GITHUB_ACTIONS=true 2025-12-04T11:12:31.5288653Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5288802Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5289027Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/periodic-rocm-mi300.yml@refs/heads/main 2025-12-04T11:12:31.5289234Z UCC_HOME=/usr 2025-12-04T11:12:31.5289343Z RUNNER_ENVIRONMENT=self-hosted 2025-12-04T11:12:31.5289587Z VERBOSE_TEST_LOGS=False 2025-12-04T11:12:31.5289700Z GITHUB_REF=refs/heads/main 2025-12-04T11:12:31.5289811Z RUNNER_OS=Linux 2025-12-04T11:12:31.5289908Z SHARD_NUMBER=2 2025-12-04T11:12:31.5290064Z GITHUB_REF_PROTECTED=true 2025-12-04T11:12:31.5290180Z RUNNER_MANUALLY_TRAP_SIG=1 2025-12-04T11:12:31.5290289Z 
HOME=/var/lib/jenkins 2025-12-04T11:12:31.5290406Z GITHUB_API_URL=https://api.github.com 2025-12-04T11:12:31.5290544Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T11:12:31.5290682Z RUNNER_DOCS_DIR=/home/runner/_work/_temp/docs 2025-12-04T11:12:31.5290814Z LANG=C.UTF-8 2025-12-04T11:12:31.5290928Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-12-04T11:12:31.5291067Z PYTORCH_TEST_WITH_ROCM=1 2025-12-04T11:12:31.5291211Z RUNNER_TRACKING_ID=github_8890dc2f-279f-4663-b384-d74a6fcb36d4 2025-12-04T11:12:31.5291359Z RUNNER_ARCH=X64 2025-12-04T11:12:31.5291462Z RUNNER_TEMP=/home/runner/_work/_temp 2025-12-04T11:12:31.5291580Z NUM_TEST_SHARDS=3 2025-12-04T11:12:31.5291676Z UCX_HOME=/usr 2025-12-04T11:12:31.5291867Z GITHUB_STATE=/home/runner/_work/_temp/_runner_file_commands/save_state_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5292214Z JOB_NAME=linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:31.5292516Z MAGMA_HOME=/opt/rocm/magma 2025-12-04T11:12:31.5292708Z GITHUB_ENV=/home/runner/_work/_temp/_runner_file_commands/set_env_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5292950Z GITHUB_EVENT_PATH=/home/runner/_work/_temp/_github_workflow/event.json 2025-12-04T11:12:31.5293112Z GITHUB_EVENT_NAME=schedule 2025-12-04T11:12:31.5293270Z GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT=actions-runner-controller/0.12.1 2025-12-04T11:12:31.5293435Z DASHBOARD_TAG= 2025-12-04T11:12:31.5293532Z GITHUB_RUN_ID=19922798714 2025-12-04T11:12:31.5293741Z GITHUB_STEP_SUMMARY=/home/runner/_work/_temp/_runner_file_commands/step_summary_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5293969Z GITHUB_ACTOR=pytorchmergebot 2025-12-04T11:12:31.5294080Z PR_NUMBER= 2025-12-04T11:12:31.5294173Z GITHUB_RUN_ATTEMPT=1 2025-12-04T11:12:31.5294280Z ANACONDA_PYTHON_VERSION=3.12 2025-12-04T11:12:31.5294415Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T11:12:31.5294552Z TERM=vt100 2025-12-04T11:12:31.5294641Z INSTALLED_VISION=yes 2025-12-04T11:12:31.5294740Z BRANCH=main 2025-12-04T11:12:31.5294838Z OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T11:12:31.5294948Z TESTS_TO_INCLUDE= 2025-12-04T11:12:31.5295106Z GITHUB_ACTION_PATH=/home/runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-12-04T11:12:31.5295296Z GITHUB_SERVER_URL=https://github.com 2025-12-04T11:12:31.5295434Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950;gfx1100 2025-12-04T11:12:31.5295585Z UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-12-04T11:12:31.5295718Z REENABLED_ISSUES= 2025-12-04T11:12:31.5295811Z SHLVL=1 2025-12-04T11:12:31.5295898Z MAX_JOBS=254 2025-12-04T11:12:31.5296032Z RUNNER_TEST_RESULTS_DIR=/home/runner/_work/_temp/test-results 2025-12-04T11:12:31.5296185Z GITHUB_ACTOR_ID=97764156 2025-12-04T11:12:31.5296302Z RUNNER_TOOL_CACHE=/home/runner/_work/_tool 2025-12-04T11:12:31.5296463Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5296615Z GITHUB_REF_NAME=main 2025-12-04T11:12:31.5296716Z ROCM_PATH=/opt/rocm 2025-12-04T11:12:31.5296811Z GITHUB_JOB=test 2025-12-04T11:12:31.5296909Z NO_TEST_TIMEOUT=False 2025-12-04T11:12:31.5297020Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T11:12:31.5297137Z LC_ALL=C.UTF-8 2025-12-04T11:12:31.5297233Z GITHUB_RETENTION_DAYS=90 2025-12-04T11:12:31.5297351Z RUNNER_WORKSPACE=/home/runner/_work/pytorch 2025-12-04T11:12:31.5297479Z OPENSSL_DIR=/opt/openssl 2025-12-04T11:12:31.5297590Z GITHUB_ACTION_REPOSITORY= 2025-12-04T11:12:31.5297944Z 
PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.12/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T11:12:31.5298388Z GITHUB_BASE_REF= 2025-12-04T11:12:31.5298481Z CI=true 2025-12-04T11:12:31.5298576Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T11:12:31.5298687Z JOB_ID=57117547540 2025-12-04T11:12:31.5298781Z GITHUB_HEAD_REF= 2025-12-04T11:12:31.5298877Z GITHUB_ACTION_REF= 2025-12-04T11:12:31.5298978Z TEST_SHOWLOCALS=False 2025-12-04T11:12:31.5299091Z GITHUB_WORKFLOW=periodic-rocm-mi300 2025-12-04T11:12:31.5299218Z DEBIAN_FRONTEND=noninteractive 2025-12-04T11:12:31.5299424Z GITHUB_OUTPUT=/home/runner/_work/_temp/_runner_file_commands/set_output_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5299628Z NO_TD=False 2025-12-04T11:12:31.5299720Z OLDPWD=/var/lib/jenkins 2025-12-04T11:12:31.5299821Z _=/usr/bin/env 2025-12-04T11:12:31.5299950Z ++ python -c 'import site; print(site.getsitepackages()[0])' 2025-12-04T11:12:31.5355563Z + TORCH_INSTALL_DIR=/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch 2025-12-04T11:12:31.5356984Z + TORCH_BIN_DIR=/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/bin 2025-12-04T11:12:31.5357300Z + TORCH_LIB_DIR=/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/lib 2025-12-04T11:12:31.5357528Z + TORCH_TEST_DIR=/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/test 2025-12-04T11:12:31.5357707Z + BUILD_DIR=build 2025-12-04T11:12:31.5358301Z + BUILD_RENAMED_DIR=build_renamed 2025-12-04T11:12:31.5358437Z + BUILD_BIN_DIR=build/bin 2025-12-04T11:12:31.5358568Z + SHARD_NUMBER=2 2025-12-04T11:12:31.5358671Z + NUM_TEST_SHARDS=3 2025-12-04T11:12:31.5358785Z + export TORCH_SERIALIZATION_DEBUG=1 2025-12-04T11:12:31.5358920Z + TORCH_SERIALIZATION_DEBUG=1 2025-12-04T11:12:31.5359039Z + export VALGRIND=ON 2025-12-04T11:12:31.5359143Z + VALGRIND=ON 2025-12-04T11:12:31.5359263Z + [[ linux-noble-rocm-py3.12-mi300 == *clang9* ]] 2025-12-04T11:12:31.5359428Z + [[ linux-noble-rocm-py3.12-mi300 == *xpu* ]] 2025-12-04T11:12:31.5359564Z + detect_cuda_arch 2025-12-04T11:12:31.5359684Z + [[ linux-noble-rocm-py3.12-mi300 == *cuda* ]] 2025-12-04T11:12:31.5359838Z + [[ linux-noble-rocm-py3.12-mi300 == *s390x* ]] 2025-12-04T11:12:31.5359974Z + [[ 0 == \1 ]] 2025-12-04T11:12:31.5360075Z + [[ True == \1 ]] 2025-12-04T11:12:31.5360190Z + [[ linux-noble-rocm-py3.12-mi300 != *bazel* ]] 2025-12-04T11:12:31.5360340Z ++ realpath build/custom_test_artifacts 2025-12-04T11:12:31.5366040Z + CUSTOM_TEST_ARTIFACT_BUILD_DIR=/var/lib/jenkins/pytorch/build/custom_test_artifacts 2025-12-04T11:12:31.5366238Z + [[ -n '' ]] 2025-12-04T11:12:31.5366343Z + echo 'Environment variables' 2025-12-04T11:12:31.5366464Z Environment variables 2025-12-04T11:12:31.5366571Z + env 2025-12-04T11:12:31.5372328Z GITHUB_WORKSPACE=/home/runner/_work/pytorch/pytorch 2025-12-04T11:12:31.5372516Z CONTINUE_THROUGH_ERROR=True 2025-12-04T11:12:31.5372659Z BUILD_ENVIRONMENT=linux-noble-rocm-py3.12-mi300 2025-12-04T11:12:31.5372835Z HOSTNAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv 2025-12-04T11:12:31.5373085Z GITHUB_PATH=/home/runner/_work/_temp/_runner_file_commands/add_path_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5373302Z GITHUB_ACTION=__run_2 2025-12-04T11:12:31.5373428Z PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 2025-12-04T11:12:31.5373551Z GITHUB_RUN_NUMBER=1861 2025-12-04T11:12:31.5373660Z TEST_CONFIG=distributed 2025-12-04T11:12:31.5373806Z 
RUNNER_NAME=linux.rocm.gpu.gfx942.4.b-bphpw-runner-rlsbv 2025-12-04T11:12:31.5373973Z GITHUB_REPOSITORY_OWNER_ID=21003710 2025-12-04T11:12:31.5374104Z AWS_DEFAULT_REGION=us-east-1 2025-12-04T11:12:31.5374246Z RUNNER_ARTIFACT_DIR=/home/runner/_work/_temp/artifacts 2025-12-04T11:12:31.5374401Z GITHUB_TRIGGERING_ACTOR=pytorchmergebot 2025-12-04T11:12:31.5374586Z GITHUB_REF_TYPE=branch 2025-12-04T11:12:31.5374715Z BASE_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5375028Z HUGGING_FACE_HUB_TOKEN=*** 2025-12-04T11:12:31.5375220Z *** 2025-12-04T11:12:31.5375319Z GITHUB_REPOSITORY_ID=65600975 2025-12-04T11:12:31.5375438Z GITHUB_ACTIONS=true 2025-12-04T11:12:31.5375559Z SHA1=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5375877Z GITHUB_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5376108Z GITHUB_WORKFLOW_REF=pytorch/pytorch/.github/workflows/periodic-rocm-mi300.yml@refs/heads/main 2025-12-04T11:12:31.5376314Z UCC_HOME=/usr 2025-12-04T11:12:31.5376417Z TORCH_SERIALIZATION_DEBUG=1 2025-12-04T11:12:31.5376537Z RUNNER_ENVIRONMENT=self-hosted 2025-12-04T11:12:31.5376657Z VERBOSE_TEST_LOGS=False 2025-12-04T11:12:31.5376770Z GITHUB_REF=refs/heads/main 2025-12-04T11:12:31.5376879Z RUNNER_OS=Linux 2025-12-04T11:12:31.5376977Z SHARD_NUMBER=2 2025-12-04T11:12:31.5377080Z GITHUB_REF_PROTECTED=true 2025-12-04T11:12:31.5377198Z RUNNER_MANUALLY_TRAP_SIG=1 2025-12-04T11:12:31.5377312Z HOME=/var/lib/jenkins 2025-12-04T11:12:31.5377444Z GITHUB_API_URL=https://api.github.com 2025-12-04T11:12:31.5377586Z PYTORCH_TEST_RERUN_DISABLED_TESTS=0 2025-12-04T11:12:31.5377732Z RUNNER_DOCS_DIR=/home/runner/_work/_temp/docs 2025-12-04T11:12:31.5377866Z LANG=C.UTF-8 2025-12-04T11:12:31.5377985Z UCX_COMMIT=29831d319e6be55cb8c768ca61de335c934ca39e 2025-12-04T11:12:31.5378136Z PYTORCH_TEST_WITH_ROCM=1 2025-12-04T11:12:31.5378336Z RUNNER_TRACKING_ID=github_8890dc2f-279f-4663-b384-d74a6fcb36d4 2025-12-04T11:12:31.5378490Z RUNNER_ARCH=X64 2025-12-04T11:12:31.5378600Z RUNNER_TEMP=/home/runner/_work/_temp 2025-12-04T11:12:31.5378785Z NUM_TEST_SHARDS=3 2025-12-04T11:12:31.5378887Z UCX_HOME=/usr 2025-12-04T11:12:31.5379087Z GITHUB_STATE=/home/runner/_work/_temp/_runner_file_commands/save_state_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5379447Z JOB_NAME=linux-noble-rocm-py3.12-mi300 / test (distributed, 2, 3, linux.rocm.gpu.gfx942.4.b, module:rocm, oncall:distributed, mem_leak_check) 2025-12-04T11:12:31.5379711Z MAGMA_HOME=/opt/rocm/magma 2025-12-04T11:12:31.5379914Z GITHUB_ENV=/home/runner/_work/_temp/_runner_file_commands/set_env_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5380162Z GITHUB_EVENT_PATH=/home/runner/_work/_temp/_github_workflow/event.json 2025-12-04T11:12:31.5380331Z GITHUB_EVENT_NAME=schedule 2025-12-04T11:12:31.5380500Z GITHUB_ACTIONS_RUNNER_EXTRA_USER_AGENT=actions-runner-controller/0.12.1 2025-12-04T11:12:31.5380670Z DASHBOARD_TAG= 2025-12-04T11:12:31.5380774Z GITHUB_RUN_ID=19922798714 2025-12-04T11:12:31.5380993Z GITHUB_STEP_SUMMARY=/home/runner/_work/_temp/_runner_file_commands/step_summary_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5381229Z GITHUB_ACTOR=pytorchmergebot 2025-12-04T11:12:31.5381347Z PR_NUMBER= 2025-12-04T11:12:31.5381450Z GITHUB_RUN_ATTEMPT=1 2025-12-04T11:12:31.5381560Z VALGRIND=ON 2025-12-04T11:12:31.5381664Z ANACONDA_PYTHON_VERSION=3.12 2025-12-04T11:12:31.5381805Z GITHUB_GRAPHQL_URL=https://api.github.com/graphql 2025-12-04T11:12:31.5381944Z TERM=vt100 2025-12-04T11:12:31.5382035Z INSTALLED_VISION=yes 
2025-12-04T11:12:31.5382139Z BRANCH=main 2025-12-04T11:12:31.5382239Z OPENSSL_ROOT_DIR=/opt/openssl 2025-12-04T11:12:31.5382358Z TESTS_TO_INCLUDE= 2025-12-04T11:12:31.5382523Z GITHUB_ACTION_PATH=/home/runner/_work/pytorch/pytorch/./.github/actions/setup-rocm 2025-12-04T11:12:31.5382719Z GITHUB_SERVER_URL=https://github.com 2025-12-04T11:12:31.5382862Z PYTORCH_ROCM_ARCH=gfx90a;gfx942;gfx950;gfx1100 2025-12-04T11:12:31.5383017Z UCC_COMMIT=9f4b242cbbd8b1462cbc732eb29316cdfa124b77 2025-12-04T11:12:31.5383156Z REENABLED_ISSUES= 2025-12-04T11:12:31.5383255Z SHLVL=1 2025-12-04T11:12:31.5383351Z MAX_JOBS=254 2025-12-04T11:12:31.5383487Z RUNNER_TEST_RESULTS_DIR=/home/runner/_work/_temp/test-results 2025-12-04T11:12:31.5383656Z GITHUB_ACTOR_ID=97764156 2025-12-04T11:12:31.5383779Z RUNNER_TOOL_CACHE=/home/runner/_work/_tool 2025-12-04T11:12:31.5383943Z GITHUB_WORKFLOW_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32 2025-12-04T11:12:31.5384099Z GITHUB_REF_NAME=main 2025-12-04T11:12:31.5384205Z ROCM_PATH=/opt/rocm 2025-12-04T11:12:31.5384308Z GITHUB_JOB=test 2025-12-04T11:12:31.5384411Z NO_TEST_TIMEOUT=False 2025-12-04T11:12:31.5384527Z GITHUB_REPOSITORY=pytorch/pytorch 2025-12-04T11:12:31.5384649Z LC_ALL=C.UTF-8 2025-12-04T11:12:31.5384753Z GITHUB_RETENTION_DAYS=90 2025-12-04T11:12:31.5384920Z RUNNER_WORKSPACE=/home/runner/_work/pytorch 2025-12-04T11:12:31.5385052Z OPENSSL_DIR=/opt/openssl 2025-12-04T11:12:31.5385160Z GITHUB_ACTION_REPOSITORY= 2025-12-04T11:12:31.5385516Z PATH=/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.12/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T11:12:31.5385868Z GITHUB_BASE_REF= 2025-12-04T11:12:31.5385961Z CI=true 2025-12-04T11:12:31.5386056Z GITHUB_REPOSITORY_OWNER=pytorch 2025-12-04T11:12:31.5386171Z JOB_ID=57117547540 2025-12-04T11:12:31.5386265Z GITHUB_HEAD_REF= 2025-12-04T11:12:31.5386361Z GITHUB_ACTION_REF= 2025-12-04T11:12:31.5386456Z TEST_SHOWLOCALS=False 2025-12-04T11:12:31.5386567Z GITHUB_WORKFLOW=periodic-rocm-mi300 2025-12-04T11:12:31.5386698Z DEBIAN_FRONTEND=noninteractive 2025-12-04T11:12:31.5386906Z GITHUB_OUTPUT=/home/runner/_work/_temp/_runner_file_commands/set_output_bf8906a4-0709-4e0b-99f3-66cba6f90f50 2025-12-04T11:12:31.5387111Z NO_TD=False 2025-12-04T11:12:31.5398440Z OLDPWD=/var/lib/jenkins 2025-12-04T11:12:31.5398557Z _=/usr/bin/env 2025-12-04T11:12:31.5398661Z + echo 'Testing pytorch' 2025-12-04T11:12:31.5398770Z Testing pytorch 2025-12-04T11:12:31.5398869Z + export LANG=C.UTF-8 2025-12-04T11:12:31.5399054Z + LANG=C.UTF-8 2025-12-04T11:12:31.5399147Z + PR_NUMBER= 2025-12-04T11:12:31.5399242Z + [[ distributed == \d\e\f\a\u\l\t ]] 2025-12-04T11:12:31.5399375Z + [[ distributed == \d\i\s\t\r\i\b\u\t\e\d ]] 2025-12-04T11:12:31.5399520Z + [[ linux-noble-rocm-py3.12-mi300 == *rocm* ]] 2025-12-04T11:12:31.5399664Z + export HIP_VISIBLE_DEVICES=0,1,2,3 2025-12-04T11:12:31.5399795Z + HIP_VISIBLE_DEVICES=0,1,2,3 2025-12-04T11:12:31.5399917Z + [[ distributed == \s\l\o\w ]] 2025-12-04T11:12:31.5400063Z + [[ linux-noble-rocm-py3.12-mi300 == *slow-gradcheck* ]] 2025-12-04T11:12:31.5400227Z + [[ linux-noble-rocm-py3.12-mi300 == *cuda* ]] 2025-12-04T11:12:31.5400374Z + [[ linux-noble-rocm-py3.12-mi300 == *rocm* ]] 2025-12-04T11:12:31.5400521Z + export PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T11:12:31.5400661Z + PYTORCH_TESTING_DEVICE_ONLY_FOR=cuda 2025-12-04T11:12:31.5400796Z + [[ distributed == *crossref* ]] 2025-12-04T11:12:31.5400928Z + 
[[ linux-noble-rocm-py3.12-mi300 == *rocm* ]] 2025-12-04T11:12:31.5401060Z + export VALGRIND=OFF 2025-12-04T11:12:31.5401171Z + VALGRIND=OFF 2025-12-04T11:12:31.5401266Z + rocminfo 2025-12-04T11:12:31.5494959Z ROCk module version 6.12.12 is loaded 2025-12-04T11:12:31.6276415Z ===================== 2025-12-04T11:12:31.6276591Z HSA System Attributes 2025-12-04T11:12:31.6276710Z ===================== 2025-12-04T11:12:31.6276823Z Runtime Version: 1.18 2025-12-04T11:12:31.6276942Z Runtime Ext Version: 1.14 2025-12-04T11:12:31.6277068Z System Timestamp Freq.: 1000.000000MHz 2025-12-04T11:12:31.6277266Z Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) 2025-12-04T11:12:31.6277479Z Machine Model: LARGE 2025-12-04T11:12:31.6277659Z System Endianness: LITTLE 2025-12-04T11:12:31.6277806Z Mwaitx: DISABLED 2025-12-04T11:12:31.6277926Z XNACK enabled: NO 2025-12-04T11:12:31.6278044Z DMAbuf Support: YES 2025-12-04T11:12:31.6278352Z VMM Support: YES 2025-12-04T11:12:31.6278432Z 2025-12-04T11:12:31.6278474Z ========== 2025-12-04T11:12:31.6278582Z HSA Agents 2025-12-04T11:12:31.6278691Z ========== 2025-12-04T11:12:31.6278805Z ******* 2025-12-04T11:12:31.6278908Z Agent 1 2025-12-04T11:12:31.6279010Z ******* 2025-12-04T11:12:31.6279142Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.6279299Z Uuid: CPU-XX 2025-12-04T11:12:31.6279460Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.6279715Z Vendor Name: CPU 2025-12-04T11:12:31.6279875Z Feature: None specified 2025-12-04T11:12:31.6280036Z Profile: FULL_PROFILE 2025-12-04T11:12:31.6280201Z Float Round Mode: NEAR 2025-12-04T11:12:31.6280384Z Max Queue Number: 0(0x0) 2025-12-04T11:12:31.6280546Z Queue Min Size: 0(0x0) 2025-12-04T11:12:31.6280710Z Queue Max Size: 0(0x0) 2025-12-04T11:12:31.6280870Z Queue Type: MULTI 2025-12-04T11:12:31.6281032Z Node: 0 2025-12-04T11:12:31.6281190Z Device Type: CPU 2025-12-04T11:12:31.6281342Z Cache Info: 2025-12-04T11:12:31.6281472Z L1: 49152(0xc000) KB 2025-12-04T11:12:31.6281629Z Chip ID: 0(0x0) 2025-12-04T11:12:31.6281791Z ASIC Revision: 0(0x0) 2025-12-04T11:12:31.6281956Z Cacheline Size: 64(0x40) 2025-12-04T11:12:31.6282120Z Max Clock Freq. (MHz): 3300 2025-12-04T11:12:31.6282345Z BDFID: 0 2025-12-04T11:12:31.6282511Z Internal Node ID: 0 2025-12-04T11:12:31.6282792Z Compute Unit: 128 2025-12-04T11:12:31.6282955Z SIMDs per CU: 0 2025-12-04T11:12:31.6283115Z Shader Engines: 0 2025-12-04T11:12:31.6283282Z Shader Arrs. per Eng.: 0 2025-12-04T11:12:31.6283451Z WatchPts on Addr. 
Ranges:1 2025-12-04T11:12:31.6283608Z Memory Properties: 2025-12-04T11:12:31.6283728Z Features: None 2025-12-04T11:12:31.6283847Z Pool Info: 2025-12-04T11:12:31.6284006Z Pool 1 2025-12-04T11:12:31.6284152Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6284325Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:12:31.6284483Z Allocatable: TRUE 2025-12-04T11:12:31.6284650Z Alloc Granule: 4KB 2025-12-04T11:12:31.6284828Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6284999Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6285165Z Accessible by all: TRUE 2025-12-04T11:12:31.6285309Z Pool 2 2025-12-04T11:12:31.6285451Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6285615Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:12:31.6285771Z Allocatable: TRUE 2025-12-04T11:12:31.6285936Z Alloc Granule: 4KB 2025-12-04T11:12:31.6286111Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6286283Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6286449Z Accessible by all: TRUE 2025-12-04T11:12:31.6286596Z Pool 3 2025-12-04T11:12:31.6286735Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T11:12:31.6286893Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:12:31.6287049Z Allocatable: TRUE 2025-12-04T11:12:31.6287212Z Alloc Granule: 4KB 2025-12-04T11:12:31.6287421Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6287595Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6287763Z Accessible by all: TRUE 2025-12-04T11:12:31.6287914Z Pool 4 2025-12-04T11:12:31.6288054Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6288263Z Size: 1584755152(0x5e7571d0) KB 2025-12-04T11:12:31.6288421Z Allocatable: TRUE 2025-12-04T11:12:31.6288585Z Alloc Granule: 4KB 2025-12-04T11:12:31.6288757Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6288927Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6289097Z Accessible by all: TRUE 2025-12-04T11:12:31.6289247Z ISA Info: 2025-12-04T11:12:31.6289362Z ******* 2025-12-04T11:12:31.6289471Z Agent 2 2025-12-04T11:12:31.6289581Z ******* 2025-12-04T11:12:31.6289707Z Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.6289909Z Uuid: CPU-XX 2025-12-04T11:12:31.6290073Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.6290241Z Vendor Name: CPU 2025-12-04T11:12:31.6290407Z Feature: None specified 2025-12-04T11:12:31.6290570Z Profile: FULL_PROFILE 2025-12-04T11:12:31.6290731Z Float Round Mode: NEAR 2025-12-04T11:12:31.6290893Z Max Queue Number: 0(0x0) 2025-12-04T11:12:31.6291057Z Queue Min Size: 0(0x0) 2025-12-04T11:12:31.6291216Z Queue Max Size: 0(0x0) 2025-12-04T11:12:31.6291375Z Queue Type: MULTI 2025-12-04T11:12:31.6291525Z Node: 1 2025-12-04T11:12:31.6291682Z Device Type: CPU 2025-12-04T11:12:31.6291826Z Cache Info: 2025-12-04T11:12:31.6291949Z L1: 49152(0xc000) KB 2025-12-04T11:12:31.6292093Z Chip ID: 0(0x0) 2025-12-04T11:12:31.6292245Z ASIC Revision: 0(0x0) 2025-12-04T11:12:31.6292405Z Cacheline Size: 64(0x40) 2025-12-04T11:12:31.6292567Z Max Clock Freq. (MHz): 3300 2025-12-04T11:12:31.6292722Z BDFID: 0 2025-12-04T11:12:31.6292877Z Internal Node ID: 1 2025-12-04T11:12:31.6293037Z Compute Unit: 128 2025-12-04T11:12:31.6293195Z SIMDs per CU: 0 2025-12-04T11:12:31.6293361Z Shader Engines: 0 2025-12-04T11:12:31.6293532Z Shader Arrs. per Eng.: 0 2025-12-04T11:12:31.6293700Z WatchPts on Addr. 
Ranges:1 2025-12-04T11:12:31.6293851Z Memory Properties: 2025-12-04T11:12:31.6293971Z Features: None 2025-12-04T11:12:31.6294088Z Pool Info: 2025-12-04T11:12:31.6294198Z Pool 1 2025-12-04T11:12:31.6294337Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6294499Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:12:31.6294703Z Allocatable: TRUE 2025-12-04T11:12:31.6294870Z Alloc Granule: 4KB 2025-12-04T11:12:31.6295042Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6295213Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6295387Z Accessible by all: TRUE 2025-12-04T11:12:31.6295534Z Pool 2 2025-12-04T11:12:31.6295676Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6295834Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:12:31.6295993Z Allocatable: TRUE 2025-12-04T11:12:31.6296155Z Alloc Granule: 4KB 2025-12-04T11:12:31.6296321Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6296491Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6296653Z Accessible by all: TRUE 2025-12-04T11:12:31.6296800Z Pool 3 2025-12-04T11:12:31.6296931Z Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED 2025-12-04T11:12:31.6297117Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:12:31.6297271Z Allocatable: TRUE 2025-12-04T11:12:31.6297530Z Alloc Granule: 4KB 2025-12-04T11:12:31.6297696Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6297861Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6298041Z Accessible by all: TRUE 2025-12-04T11:12:31.6298214Z Pool 4 2025-12-04T11:12:31.6298353Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6298508Z Size: 1585284308(0x5e7d84d4) KB 2025-12-04T11:12:31.6298701Z Allocatable: TRUE 2025-12-04T11:12:31.6298918Z Alloc Granule: 4KB 2025-12-04T11:12:31.6299090Z Alloc Recommended Granule:4KB 2025-12-04T11:12:31.6299260Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6299425Z Accessible by all: TRUE 2025-12-04T11:12:31.6299609Z ISA Info: 2025-12-04T11:12:31.6299715Z ******* 2025-12-04T11:12:31.6299834Z Agent 3 2025-12-04T11:12:31.6299936Z ******* 2025-12-04T11:12:31.6300050Z Name: gfx942 2025-12-04T11:12:31.6300194Z Uuid: GPU-e92b40ee81585045 2025-12-04T11:12:31.6300354Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.6300517Z Vendor Name: AMD 2025-12-04T11:12:31.6300671Z Feature: KERNEL_DISPATCH 2025-12-04T11:12:31.6300869Z Profile: BASE_PROFILE 2025-12-04T11:12:31.6301032Z Float Round Mode: NEAR 2025-12-04T11:12:31.6301192Z Max Queue Number: 128(0x80) 2025-12-04T11:12:31.6301349Z Queue Min Size: 64(0x40) 2025-12-04T11:12:31.6301503Z Queue Max Size: 131072(0x20000) 2025-12-04T11:12:31.6301668Z Queue Type: MULTI 2025-12-04T11:12:31.6301812Z Node: 2 2025-12-04T11:12:31.6301963Z Device Type: GPU 2025-12-04T11:12:31.6302100Z Cache Info: 2025-12-04T11:12:31.6302274Z L1: 32(0x20) KB 2025-12-04T11:12:31.6302409Z L2: 4096(0x1000) KB 2025-12-04T11:12:31.6302544Z L3: 262144(0x40000) KB 2025-12-04T11:12:31.6302685Z Chip ID: 29861(0x74a5) 2025-12-04T11:12:31.6302835Z ASIC Revision: 1(0x1) 2025-12-04T11:12:31.6303006Z Cacheline Size: 128(0x80) 2025-12-04T11:12:31.6303162Z Max Clock Freq. (MHz): 2100 2025-12-04T11:12:31.6303371Z BDFID: 62720 2025-12-04T11:12:31.6303556Z Internal Node ID: 2 2025-12-04T11:12:31.6303721Z Compute Unit: 304 2025-12-04T11:12:31.6303954Z SIMDs per CU: 4 2025-12-04T11:12:31.6304116Z Shader Engines: 32 2025-12-04T11:12:31.6304278Z Shader Arrs. per Eng.: 1 2025-12-04T11:12:31.6304483Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:12:31.6304788Z Coherent Host Access: FALSE 2025-12-04T11:12:31.6304938Z Memory Properties: 2025-12-04T11:12:31.6305062Z Features: KERNEL_DISPATCH 2025-12-04T11:12:31.6305225Z Fast F16 Operation: TRUE 2025-12-04T11:12:31.6305452Z Wavefront Size: 64(0x40) 2025-12-04T11:12:31.6305657Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6305803Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6305938Z x 1024(0x400) 2025-12-04T11:12:31.6306072Z y 1024(0x400) 2025-12-04T11:12:31.6306233Z z 1024(0x400) 2025-12-04T11:12:31.6306377Z Max Waves Per CU: 32(0x20) 2025-12-04T11:12:31.6306535Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:12:31.6306718Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6306890Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6307009Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6307142Z y 65535(0xffff) 2025-12-04T11:12:31.6307283Z z 65535(0xffff) 2025-12-04T11:12:31.6307431Z Max fbarriers/Workgrp: 32 2025-12-04T11:12:31.6307650Z Packet Processor uCode:: 185 2025-12-04T11:12:31.6307820Z SDMA engine uCode:: 24 2025-12-04T11:12:31.6307984Z IOMMU Support:: None 2025-12-04T11:12:31.6308125Z Pool Info: 2025-12-04T11:12:31.6308316Z Pool 1 2025-12-04T11:12:31.6308462Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6308631Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6308794Z Allocatable: TRUE 2025-12-04T11:12:31.6308963Z Alloc Granule: 4KB 2025-12-04T11:12:31.6309139Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6309318Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6309492Z Accessible by all: FALSE 2025-12-04T11:12:31.6309642Z Pool 2 2025-12-04T11:12:31.6309785Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6310071Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6310245Z Allocatable: TRUE 2025-12-04T11:12:31.6310412Z Alloc Granule: 4KB 2025-12-04T11:12:31.6310635Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6310807Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6311194Z Accessible by all: FALSE 2025-12-04T11:12:31.6311357Z Pool 3 2025-12-04T11:12:31.6311496Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6311660Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6311819Z Allocatable: TRUE 2025-12-04T11:12:31.6312089Z Alloc Granule: 4KB 2025-12-04T11:12:31.6312273Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6312498Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6312669Z Accessible by all: FALSE 2025-12-04T11:12:31.6312818Z Pool 4 2025-12-04T11:12:31.6313029Z Segment: GROUP 2025-12-04T11:12:31.6313184Z Size: 64(0x40) KB 2025-12-04T11:12:31.6313349Z Allocatable: FALSE 2025-12-04T11:12:31.6313548Z Alloc Granule: 0KB 2025-12-04T11:12:31.6313737Z Alloc Recommended Granule:0KB 2025-12-04T11:12:31.6313910Z Alloc Alignment: 0KB 2025-12-04T11:12:31.6314079Z Accessible by all: FALSE 2025-12-04T11:12:31.6314228Z ISA Info: 2025-12-04T11:12:31.6314355Z ISA 1 2025-12-04T11:12:31.6314503Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:12:31.6314685Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6314902Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6315085Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6315337Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6315548Z Fast f16: TRUE 2025-12-04T11:12:31.6315713Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6315869Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6316015Z x 1024(0x400) 2025-12-04T11:12:31.6316160Z y 1024(0x400) 2025-12-04T11:12:31.6316305Z z 1024(0x400) 2025-12-04T11:12:31.6316459Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6316612Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6316806Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6316991Z y 65535(0xffff) 2025-12-04T11:12:31.6317134Z z 65535(0xffff) 2025-12-04T11:12:31.6317292Z FBarrier Max Size: 32 2025-12-04T11:12:31.6317449Z ISA 2 2025-12-04T11:12:31.6317604Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:12:31.6317792Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6317967Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6318238Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6318414Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6318580Z Fast f16: TRUE 2025-12-04T11:12:31.6318794Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6318979Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6319118Z x 1024(0x400) 2025-12-04T11:12:31.6319265Z y 1024(0x400) 2025-12-04T11:12:31.6319418Z z 1024(0x400) 2025-12-04T11:12:31.6319572Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6319723Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6319857Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6320004Z y 65535(0xffff) 2025-12-04T11:12:31.6320145Z z 65535(0xffff) 2025-12-04T11:12:31.6320301Z FBarrier Max Size: 32 2025-12-04T11:12:31.6320448Z ******* 2025-12-04T11:12:31.6320666Z Agent 4 2025-12-04T11:12:31.6320770Z ******* 2025-12-04T11:12:31.6320895Z Name: gfx942 2025-12-04T11:12:31.6321084Z Uuid: GPU-0f23c118dd1bca7f 2025-12-04T11:12:31.6321250Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.6321441Z Vendor Name: AMD 2025-12-04T11:12:31.6321605Z Feature: KERNEL_DISPATCH 2025-12-04T11:12:31.6321768Z Profile: BASE_PROFILE 2025-12-04T11:12:31.6321938Z Float Round Mode: NEAR 2025-12-04T11:12:31.6322116Z Max Queue Number: 128(0x80) 2025-12-04T11:12:31.6322279Z Queue Min Size: 64(0x40) 2025-12-04T11:12:31.6322450Z Queue Max Size: 131072(0x20000) 2025-12-04T11:12:31.6322615Z Queue Type: MULTI 2025-12-04T11:12:31.6322799Z Node: 3 2025-12-04T11:12:31.6322953Z Device Type: GPU 2025-12-04T11:12:31.6323096Z Cache Info: 2025-12-04T11:12:31.6323230Z L1: 32(0x20) KB 2025-12-04T11:12:31.6323378Z L2: 4096(0x1000) KB 2025-12-04T11:12:31.6323520Z L3: 262144(0x40000) KB 2025-12-04T11:12:31.6323668Z Chip ID: 29861(0x74a5) 2025-12-04T11:12:31.6323828Z ASIC Revision: 1(0x1) 2025-12-04T11:12:31.6323992Z Cacheline Size: 128(0x80) 2025-12-04T11:12:31.6324158Z Max Clock Freq. (MHz): 2100 2025-12-04T11:12:31.6324321Z BDFID: 34048 2025-12-04T11:12:31.6324480Z Internal Node ID: 3 2025-12-04T11:12:31.6324645Z Compute Unit: 304 2025-12-04T11:12:31.6324806Z SIMDs per CU: 4 2025-12-04T11:12:31.6324970Z Shader Engines: 32 2025-12-04T11:12:31.6325139Z Shader Arrs. per Eng.: 1 2025-12-04T11:12:31.6325310Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:12:31.6325483Z Coherent Host Access: FALSE 2025-12-04T11:12:31.6325721Z Memory Properties: 2025-12-04T11:12:31.6325848Z Features: KERNEL_DISPATCH 2025-12-04T11:12:31.6326002Z Fast F16 Operation: TRUE 2025-12-04T11:12:31.6326169Z Wavefront Size: 64(0x40) 2025-12-04T11:12:31.6326340Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6326497Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6326635Z x 1024(0x400) 2025-12-04T11:12:31.6326776Z y 1024(0x400) 2025-12-04T11:12:31.6326914Z z 1024(0x400) 2025-12-04T11:12:31.6327065Z Max Waves Per CU: 32(0x20) 2025-12-04T11:12:31.6327233Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:12:31.6327399Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6327554Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6327683Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6327824Z y 65535(0xffff) 2025-12-04T11:12:31.6327964Z z 65535(0xffff) 2025-12-04T11:12:31.6328240Z Max fbarriers/Workgrp: 32 2025-12-04T11:12:31.6328417Z Packet Processor uCode:: 185 2025-12-04T11:12:31.6328590Z SDMA engine uCode:: 24 2025-12-04T11:12:31.6328757Z IOMMU Support:: None 2025-12-04T11:12:31.6328904Z Pool Info: 2025-12-04T11:12:31.6329020Z Pool 1 2025-12-04T11:12:31.6329166Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6329329Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6329497Z Allocatable: TRUE 2025-12-04T11:12:31.6329666Z Alloc Granule: 4KB 2025-12-04T11:12:31.6329843Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6330024Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6330196Z Accessible by all: FALSE 2025-12-04T11:12:31.6330346Z Pool 2 2025-12-04T11:12:31.6330489Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6330651Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6330811Z Allocatable: TRUE 2025-12-04T11:12:31.6330977Z Alloc Granule: 4KB 2025-12-04T11:12:31.6331152Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6331330Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6331501Z Accessible by all: FALSE 2025-12-04T11:12:31.6331652Z Pool 3 2025-12-04T11:12:31.6331792Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6331956Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6332115Z Allocatable: TRUE 2025-12-04T11:12:31.6332282Z Alloc Granule: 4KB 2025-12-04T11:12:31.6332451Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6332625Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6332795Z Accessible by all: FALSE 2025-12-04T11:12:31.6332944Z Pool 4 2025-12-04T11:12:31.6333134Z Segment: GROUP 2025-12-04T11:12:31.6333289Z Size: 64(0x40) KB 2025-12-04T11:12:31.6333446Z Allocatable: FALSE 2025-12-04T11:12:31.6333613Z Alloc Granule: 0KB 2025-12-04T11:12:31.6333791Z Alloc Recommended Granule:0KB 2025-12-04T11:12:31.6333965Z Alloc Alignment: 0KB 2025-12-04T11:12:31.6334134Z Accessible by all: FALSE 2025-12-04T11:12:31.6334284Z ISA Info: 2025-12-04T11:12:31.6334400Z ISA 1 2025-12-04T11:12:31.6334543Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:12:31.6334714Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6334883Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6335049Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6335218Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6335376Z Fast f16: TRUE 2025-12-04T11:12:31.6335578Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6335726Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6335857Z x 1024(0x400) 2025-12-04T11:12:31.6335991Z y 1024(0x400) 2025-12-04T11:12:31.6336126Z z 1024(0x400) 2025-12-04T11:12:31.6336271Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6336413Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6336536Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6336675Z y 65535(0xffff) 2025-12-04T11:12:31.6336809Z z 65535(0xffff) 2025-12-04T11:12:31.6336958Z FBarrier Max Size: 32 2025-12-04T11:12:31.6337101Z ISA 2 2025-12-04T11:12:31.6337248Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:12:31.6337426Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6337593Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6337757Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6337925Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6338081Z Fast f16: TRUE 2025-12-04T11:12:31.6338278Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6338429Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6338561Z x 1024(0x400) 2025-12-04T11:12:31.6338694Z y 1024(0x400) 2025-12-04T11:12:31.6338833Z z 1024(0x400) 2025-12-04T11:12:31.6338980Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6339121Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6339248Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6339380Z y 65535(0xffff) 2025-12-04T11:12:31.6339513Z z 65535(0xffff) 2025-12-04T11:12:31.6339662Z FBarrier Max Size: 32 2025-12-04T11:12:31.6339800Z ******* 2025-12-04T11:12:31.6339903Z Agent 5 2025-12-04T11:12:31.6340048Z ******* 2025-12-04T11:12:31.6340166Z Name: gfx942 2025-12-04T11:12:31.6340311Z Uuid: GPU-1385052698a87313 2025-12-04T11:12:31.6340473Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.6340638Z Vendor Name: AMD 2025-12-04T11:12:31.6340797Z Feature: KERNEL_DISPATCH 2025-12-04T11:12:31.6340950Z Profile: BASE_PROFILE 2025-12-04T11:12:31.6341106Z Float Round Mode: NEAR 2025-12-04T11:12:31.6341263Z Max Queue Number: 128(0x80) 2025-12-04T11:12:31.6341418Z Queue Min Size: 64(0x40) 2025-12-04T11:12:31.6341570Z Queue Max Size: 131072(0x20000) 2025-12-04T11:12:31.6341726Z Queue Type: MULTI 2025-12-04T11:12:31.6341871Z Node: 4 2025-12-04T11:12:31.6342016Z Device Type: GPU 2025-12-04T11:12:31.6342152Z Cache Info: 2025-12-04T11:12:31.6342313Z L1: 32(0x20) KB 2025-12-04T11:12:31.6342447Z L2: 4096(0x1000) KB 2025-12-04T11:12:31.6342578Z L3: 262144(0x40000) KB 2025-12-04T11:12:31.6342716Z Chip ID: 29861(0x74a5) 2025-12-04T11:12:31.6342864Z ASIC Revision: 1(0x1) 2025-12-04T11:12:31.6343021Z Cacheline Size: 128(0x80) 2025-12-04T11:12:31.6343179Z Max Clock Freq. (MHz): 2100 2025-12-04T11:12:31.6343328Z BDFID: 58624 2025-12-04T11:12:31.6343482Z Internal Node ID: 4 2025-12-04T11:12:31.6343640Z Compute Unit: 304 2025-12-04T11:12:31.6343791Z SIMDs per CU: 4 2025-12-04T11:12:31.6343954Z Shader Engines: 32 2025-12-04T11:12:31.6344115Z Shader Arrs. per Eng.: 1 2025-12-04T11:12:31.6344278Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:12:31.6344442Z Coherent Host Access: FALSE 2025-12-04T11:12:31.6344586Z Memory Properties: 2025-12-04T11:12:31.6344705Z Features: KERNEL_DISPATCH 2025-12-04T11:12:31.6344850Z Fast F16 Operation: TRUE 2025-12-04T11:12:31.6345010Z Wavefront Size: 64(0x40) 2025-12-04T11:12:31.6345170Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6345317Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6345447Z x 1024(0x400) 2025-12-04T11:12:31.6345581Z y 1024(0x400) 2025-12-04T11:12:31.6345720Z z 1024(0x400) 2025-12-04T11:12:31.6345863Z Max Waves Per CU: 32(0x20) 2025-12-04T11:12:31.6346023Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:12:31.6346182Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6346323Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6346444Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6346576Z y 65535(0xffff) 2025-12-04T11:12:31.6346706Z z 65535(0xffff) 2025-12-04T11:12:31.6346931Z Max fbarriers/Workgrp: 32 2025-12-04T11:12:31.6347101Z Packet Processor uCode:: 185 2025-12-04T11:12:31.6347265Z SDMA engine uCode:: 24 2025-12-04T11:12:31.6347428Z IOMMU Support:: None 2025-12-04T11:12:31.6347575Z Pool Info: 2025-12-04T11:12:31.6347686Z Pool 1 2025-12-04T11:12:31.6347820Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6347973Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6348126Z Allocatable: TRUE 2025-12-04T11:12:31.6348326Z Alloc Granule: 4KB 2025-12-04T11:12:31.6348492Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6348660Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6348826Z Accessible by all: FALSE 2025-12-04T11:12:31.6348965Z Pool 2 2025-12-04T11:12:31.6349103Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6349264Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6349454Z Allocatable: TRUE 2025-12-04T11:12:31.6349614Z Alloc Granule: 4KB 2025-12-04T11:12:31.6349787Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6349958Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6350127Z Accessible by all: FALSE 2025-12-04T11:12:31.6350273Z Pool 3 2025-12-04T11:12:31.6350409Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6350572Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6350728Z Allocatable: TRUE 2025-12-04T11:12:31.6350892Z Alloc Granule: 4KB 2025-12-04T11:12:31.6351065Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6351241Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6351409Z Accessible by all: FALSE 2025-12-04T11:12:31.6351556Z Pool 4 2025-12-04T11:12:31.6351689Z Segment: GROUP 2025-12-04T11:12:31.6351839Z Size: 64(0x40) KB 2025-12-04T11:12:31.6351994Z Allocatable: FALSE 2025-12-04T11:12:31.6352156Z Alloc Granule: 0KB 2025-12-04T11:12:31.6352329Z Alloc Recommended Granule:0KB 2025-12-04T11:12:31.6352499Z Alloc Alignment: 0KB 2025-12-04T11:12:31.6352666Z Accessible by all: FALSE 2025-12-04T11:12:31.6352813Z ISA Info: 2025-12-04T11:12:31.6352934Z ISA 1 2025-12-04T11:12:31.6353076Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:12:31.6353252Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6353422Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6353592Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6353765Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6353927Z Fast f16: TRUE 2025-12-04T11:12:31.6354088Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6354383Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6354520Z x 1024(0x400) 2025-12-04T11:12:31.6354659Z y 1024(0x400) 2025-12-04T11:12:31.6354797Z z 1024(0x400) 2025-12-04T11:12:31.6354947Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6355094Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6355222Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6355361Z y 65535(0xffff) 2025-12-04T11:12:31.6355497Z z 65535(0xffff) 2025-12-04T11:12:31.6355648Z FBarrier Max Size: 32 2025-12-04T11:12:31.6355791Z ISA 2 2025-12-04T11:12:31.6355948Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:12:31.6356131Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6356301Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6356515Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6356687Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6356849Z Fast f16: TRUE 2025-12-04T11:12:31.6357009Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6357161Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6357297Z x 1024(0x400) 2025-12-04T11:12:31.6357435Z y 1024(0x400) 2025-12-04T11:12:31.6357572Z z 1024(0x400) 2025-12-04T11:12:31.6357727Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6357873Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6358003Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6358183Z y 65535(0xffff) 2025-12-04T11:12:31.6358328Z z 65535(0xffff) 2025-12-04T11:12:31.6358481Z FBarrier Max Size: 32 2025-12-04T11:12:31.6358625Z ******* 2025-12-04T11:12:31.6358736Z Agent 6 2025-12-04T11:12:31.6358840Z ******* 2025-12-04T11:12:31.6358962Z Name: gfx942 2025-12-04T11:12:31.6359115Z Uuid: GPU-7b47bcc6019ee30a 2025-12-04T11:12:31.6359278Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.6359447Z Vendor Name: AMD 2025-12-04T11:12:31.6359607Z Feature: KERNEL_DISPATCH 2025-12-04T11:12:31.6359767Z Profile: BASE_PROFILE 2025-12-04T11:12:31.6359929Z Float Round Mode: NEAR 2025-12-04T11:12:31.6360099Z Max Queue Number: 128(0x80) 2025-12-04T11:12:31.6360260Z Queue Min Size: 64(0x40) 2025-12-04T11:12:31.6360419Z Queue Max Size: 131072(0x20000) 2025-12-04T11:12:31.6360578Z Queue Type: MULTI 2025-12-04T11:12:31.6360729Z Node: 5 2025-12-04T11:12:31.6360880Z Device Type: GPU 2025-12-04T11:12:31.6361021Z Cache Info: 2025-12-04T11:12:31.6361145Z L1: 32(0x20) KB 2025-12-04T11:12:31.6361330Z L2: 4096(0x1000) KB 2025-12-04T11:12:31.6361468Z L3: 262144(0x40000) KB 2025-12-04T11:12:31.6361612Z Chip ID: 29861(0x74a5) 2025-12-04T11:12:31.6361766Z ASIC Revision: 1(0x1) 2025-12-04T11:12:31.6361925Z Cacheline Size: 128(0x80) 2025-12-04T11:12:31.6362085Z Max Clock Freq. (MHz): 2100 2025-12-04T11:12:31.6362237Z BDFID: 38144 2025-12-04T11:12:31.6362392Z Internal Node ID: 5 2025-12-04T11:12:31.6362546Z Compute Unit: 304 2025-12-04T11:12:31.6362696Z SIMDs per CU: 4 2025-12-04T11:12:31.6362849Z Shader Engines: 32 2025-12-04T11:12:31.6363016Z Shader Arrs. per Eng.: 1 2025-12-04T11:12:31.6363177Z WatchPts on Addr. 
Ranges:4 2025-12-04T11:12:31.6363342Z Coherent Host Access: FALSE 2025-12-04T11:12:31.6363534Z Memory Properties: 2025-12-04T11:12:31.6363653Z Features: KERNEL_DISPATCH 2025-12-04T11:12:31.6363799Z Fast F16 Operation: TRUE 2025-12-04T11:12:31.6363958Z Wavefront Size: 64(0x40) 2025-12-04T11:12:31.6364121Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6364271Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6364404Z x 1024(0x400) 2025-12-04T11:12:31.6364545Z y 1024(0x400) 2025-12-04T11:12:31.6364681Z z 1024(0x400) 2025-12-04T11:12:31.6364827Z Max Waves Per CU: 32(0x20) 2025-12-04T11:12:31.6364985Z Max Work-item Per CU: 2048(0x800) 2025-12-04T11:12:31.6365147Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6365297Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6365420Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6365558Z y 65535(0xffff) 2025-12-04T11:12:31.6365693Z z 65535(0xffff) 2025-12-04T11:12:31.6365848Z Max fbarriers/Workgrp: 32 2025-12-04T11:12:31.6366022Z Packet Processor uCode:: 185 2025-12-04T11:12:31.6366192Z SDMA engine uCode:: 24 2025-12-04T11:12:31.6366357Z IOMMU Support:: None 2025-12-04T11:12:31.6366501Z Pool Info: 2025-12-04T11:12:31.6366617Z Pool 1 2025-12-04T11:12:31.6366759Z Segment: GLOBAL; FLAGS: COARSE GRAINED 2025-12-04T11:12:31.6366919Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6367083Z Allocatable: TRUE 2025-12-04T11:12:31.6367253Z Alloc Granule: 4KB 2025-12-04T11:12:31.6367428Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6367600Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6367767Z Accessible by all: FALSE 2025-12-04T11:12:31.6367913Z Pool 2 2025-12-04T11:12:31.6368053Z Segment: GLOBAL; FLAGS: EXTENDED FINE GRAINED 2025-12-04T11:12:31.6368251Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6368455Z Allocatable: TRUE 2025-12-04T11:12:31.6368618Z Alloc Granule: 4KB 2025-12-04T11:12:31.6368788Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6368961Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6369130Z Accessible by all: FALSE 2025-12-04T11:12:31.6369275Z Pool 3 2025-12-04T11:12:31.6369413Z Segment: GLOBAL; FLAGS: FINE GRAINED 2025-12-04T11:12:31.6369569Z Size: 268419072(0xfffc000) KB 2025-12-04T11:12:31.6369726Z Allocatable: TRUE 2025-12-04T11:12:31.6369890Z Alloc Granule: 4KB 2025-12-04T11:12:31.6370064Z Alloc Recommended Granule:2048KB 2025-12-04T11:12:31.6370239Z Alloc Alignment: 4KB 2025-12-04T11:12:31.6370407Z Accessible by all: FALSE 2025-12-04T11:12:31.6370552Z Pool 4 2025-12-04T11:12:31.6370686Z Segment: GROUP 2025-12-04T11:12:31.6370878Z Size: 64(0x40) KB 2025-12-04T11:12:31.6371032Z Allocatable: FALSE 2025-12-04T11:12:31.6371196Z Alloc Granule: 0KB 2025-12-04T11:12:31.6371366Z Alloc Recommended Granule:0KB 2025-12-04T11:12:31.6371537Z Alloc Alignment: 0KB 2025-12-04T11:12:31.6371707Z Accessible by all: FALSE 2025-12-04T11:12:31.6371855Z ISA Info: 2025-12-04T11:12:31.6371967Z ISA 1 2025-12-04T11:12:31.6372108Z Name: amdgcn-amd-amdhsa--gfx942:sramecc+:xnack- 2025-12-04T11:12:31.6372282Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6372452Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6372630Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6372802Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6372965Z Fast f16: TRUE 2025-12-04T11:12:31.6373126Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6373278Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6373415Z x 1024(0x400) 2025-12-04T11:12:31.6373555Z y 1024(0x400) 2025-12-04T11:12:31.6373693Z z 1024(0x400) 2025-12-04T11:12:31.6373848Z 
Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6373996Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6374131Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6374275Z y 65535(0xffff) 2025-12-04T11:12:31.6374412Z z 65535(0xffff) 2025-12-04T11:12:31.6374566Z FBarrier Max Size: 32 2025-12-04T11:12:31.6374711Z ISA 2 2025-12-04T11:12:31.6374858Z Name: amdgcn-amd-amdhsa--gfx9-4-generic:sramecc+:xnack- 2025-12-04T11:12:31.6375040Z Machine Models: HSA_MACHINE_MODEL_LARGE 2025-12-04T11:12:31.6375209Z Profiles: HSA_PROFILE_BASE 2025-12-04T11:12:31.6375377Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6375578Z Default Rounding Mode: NEAR 2025-12-04T11:12:31.6375742Z Fast f16: TRUE 2025-12-04T11:12:31.6375904Z Workgroup Max Size: 1024(0x400) 2025-12-04T11:12:31.6376059Z Workgroup Max Size per Dimension: 2025-12-04T11:12:31.6376195Z x 1024(0x400) 2025-12-04T11:12:31.6376335Z y 1024(0x400) 2025-12-04T11:12:31.6376468Z z 1024(0x400) 2025-12-04T11:12:31.6376618Z Grid Max Size: 4294967295(0xffffffff) 2025-12-04T11:12:31.6376764Z Grid Max Size per Dimension: 2025-12-04T11:12:31.6376888Z x 2147483647(0x7fffffff) 2025-12-04T11:12:31.6377023Z y 65535(0xffff) 2025-12-04T11:12:31.6377164Z z 65535(0xffff) 2025-12-04T11:12:31.6377320Z FBarrier Max Size: 32 2025-12-04T11:12:31.6377459Z *** Done *** 2025-12-04T11:12:31.6385285Z + rocminfo 2025-12-04T11:12:31.6387736Z + grep -E 'Name:.*\sgfx|Marketing' 2025-12-04T11:12:31.7230121Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.7230514Z Marketing Name: AMD EPYC 9575F 64-Core Processor 2025-12-04T11:12:31.7230887Z Name: gfx942 2025-12-04T11:12:31.7231179Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.7231505Z Name: gfx942 2025-12-04T11:12:31.7231859Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.7232146Z Name: gfx942 2025-12-04T11:12:31.7232514Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.7232853Z Name: gfx942 2025-12-04T11:12:31.7233131Z Marketing Name: AMD Radeon Graphics 2025-12-04T11:12:31.7336023Z + MAYBE_ROCM=rocm/ 2025-12-04T11:12:31.7336271Z + [[ linux-noble-rocm-py3.12-mi300 == *xpu* ]] 2025-12-04T11:12:31.7336543Z + [[ linux-noble-rocm-py3.12-mi300 != *-bazel-* ]] 2025-12-04T11:12:31.7351507Z + pip_install ninja==1.10.2 2025-12-04T11:12:31.7351760Z + pip_install_pkg='python3 -m pip install --progress-bar off' 2025-12-04T11:12:31.7352053Z + python3 -m pip install --progress-bar off ninja==1.10.2 2025-12-04T11:12:31.9532767Z Collecting ninja==1.10.2 2025-12-04T11:12:31.9810204Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl.metadata (5.0 kB) 2025-12-04T11:12:31.9917550Z Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB) 2025-12-04T11:12:32.0902101Z Installing collected packages: ninja 2025-12-04T11:12:32.0902414Z Attempting uninstall: ninja 2025-12-04T11:12:32.0913681Z Found existing installation: ninja 1.11.1.4 2025-12-04T11:12:32.0923884Z Uninstalling ninja-1.11.1.4: 2025-12-04T11:12:32.0995586Z Successfully uninstalled ninja-1.11.1.4 2025-12-04T11:12:32.1084511Z Successfully installed ninja-1.10.2 2025-12-04T11:12:32.1495501Z + export PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.12/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T11:12:32.1496393Z + 
PATH=/var/lib/jenkins/.local/bin:/opt/cache/bin:/opt/rocm/llvm/bin:/opt/rocm/opencl/bin:/opt/rocm/hip/bin:/opt/rocm/hcc/bin:/opt/rocm/bin:/opt/conda/envs/py_3.12/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin 2025-12-04T11:12:32.1496911Z + [[ linux-noble-rocm-py3.12-mi300 == *aarch64* ]] 2025-12-04T11:12:32.1497483Z + [[ linux-noble-rocm-py3.12-mi300 == *asan* ]] 2025-12-04T11:12:32.1497755Z + [[ linux-noble-rocm-py3.12-mi300 == *-debug* ]] 2025-12-04T11:12:32.1497944Z + [[ linux-noble-rocm-py3.12-mi300 != *-bazel-* ]] 2025-12-04T11:12:32.1498308Z + echo 'We are not in debug mode: linux-noble-rocm-py3.12-mi300. Expect the assertion to pass' 2025-12-04T11:12:32.1498680Z We are not in debug mode: linux-noble-rocm-py3.12-mi300. Expect the assertion to pass 2025-12-04T11:12:32.1498913Z + cd test 2025-12-04T11:12:32.1499095Z + python -c 'import torch; torch._C._crash_if_debug_asserts_fail(424242)' 2025-12-04T11:12:33.1054026Z + [[ distributed == \n\o\g\p\u\_\N\O\_\A\V\X\2 ]] 2025-12-04T11:12:33.1054507Z + [[ distributed == \n\o\g\p\u\_\A\V\X\5\1\2 ]] 2025-12-04T11:12:33.1054954Z + [[ distributed == \l\e\g\a\c\y\_\n\v\i\d\i\a\_\d\r\i\v\e\r ]] 2025-12-04T11:12:33.1058615Z + DYNAMO_BENCHMARK_FLAGS=() 2025-12-04T11:12:33.1059039Z + [[ distributed == *pr_time_benchmarks* ]] 2025-12-04T11:12:33.1059426Z + [[ distributed == *dynamo_eager* ]] 2025-12-04T11:12:33.1059814Z + [[ distributed == *aot_eager* ]] 2025-12-04T11:12:33.1060161Z + [[ distributed == *aot_inductor* ]] 2025-12-04T11:12:33.1060523Z + [[ distributed == *max_autotune_inductor* ]] 2025-12-04T11:12:33.1060887Z + [[ distributed == *inductor* ]] 2025-12-04T11:12:33.1061627Z + [[ distributed == *dynamic* ]] 2025-12-04T11:12:33.1061954Z + [[ distributed == *cpu* ]] 2025-12-04T11:12:33.1062255Z + [[ distributed == *xpu* ]] 2025-12-04T11:12:33.1062546Z + DYNAMO_BENCHMARK_FLAGS+=(--device cuda) 2025-12-04T11:12:33.1077783Z + [[ linux-noble-rocm-py3.12-mi300 == *libtorch* ]] 2025-12-04T11:12:33.1078041Z + [[ linux-noble-rocm-py3.12-mi300 == *-bazel-* ]] 2025-12-04T11:12:33.1081392Z + cd test 2025-12-04T11:12:33.1081591Z + python -c 'import torch; print(torch.__config__.show())' 2025-12-04T11:12:33.9106865Z PyTorch built with: 2025-12-04T11:12:33.9107173Z - GCC 11.5 2025-12-04T11:12:33.9107342Z - C++ Version: 201703 2025-12-04T11:12:33.9107775Z - Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T11:12:33.9108255Z - Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T11:12:33.9108534Z - OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-12-04T11:12:33.9108748Z - LAPACK is enabled (usually provided by MKL) 2025-12-04T11:12:33.9108977Z - NNPACK is enabled 2025-12-04T11:12:33.9109150Z - CPU capability usage: AVX512 2025-12-04T11:12:33.9109340Z - HIP Runtime 7.1.25424 2025-12-04T11:12:33.9109505Z - MIOpen 3.5.1 2025-12-04T11:12:33.9109648Z - Magma 2.9.0 2025-12-04T11:12:33.9112112Z - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, COMMIT_SHA=ffd9b0fb4355e97af82fc42cf185c3ffa0fc0a32, CXX_COMPILER=/opt/cache/bin/c++, CXX_FLAGS= -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DLIBKINETO_NOXPUPTI=ON -DUSE_FBGEMM -DUSE_FBGEMM_GENAI -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, TORCH_VERSION=2.10.0, USE_CUDA=OFF, USE_CUDNN=OFF, USE_CUSPARSELT=OFF, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=ON, USE_ROCM_KERNEL_ASSERT=OFF, USE_XCCL=OFF, USE_XPU=OFF, 2025-12-04T11:12:33.9114500Z 2025-12-04T11:12:34.1810504Z + cd test 2025-12-04T11:12:34.1810894Z + python -c 'import torch; print(torch.__config__.parallel_info())' 2025-12-04T11:12:34.8973669Z ATen/Parallel: 2025-12-04T11:12:34.8974049Z at::get_num_threads() : 128 2025-12-04T11:12:34.8974895Z at::get_num_interop_threads() : 128 2025-12-04T11:12:34.8975158Z OpenMP 201511 (a.k.a. 
OpenMP 4.5) 2025-12-04T11:12:34.8975408Z omp_get_max_threads() : 128 2025-12-04T11:12:34.8975855Z Intel(R) oneAPI Math Kernel Library Version 2024.2-Product Build 20240605 for Intel(R) 64 architecture applications 2025-12-04T11:12:34.8976311Z mkl_get_max_threads() : 128 2025-12-04T11:12:34.8976623Z Intel(R) MKL-DNN v3.7.1 (Git Hash 8d263e693366ef8db40acc569cc7d8edf644556d) 2025-12-04T11:12:34.8976967Z std::thread::hardware_concurrency() : 256 2025-12-04T11:12:34.8977215Z Environment variables: 2025-12-04T11:12:34.8977445Z OMP_NUM_THREADS : [not set] 2025-12-04T11:12:34.8977661Z MKL_NUM_THREADS : [not set] 2025-12-04T11:12:34.8977886Z ATen parallel backend: OpenMP 2025-12-04T11:12:34.8978032Z 2025-12-04T11:12:35.1440704Z + [[ distributed == *numpy_2* ]] 2025-12-04T11:12:35.1441012Z + [[ linux-noble-rocm-py3.12-mi300 == *aarch64* ]] 2025-12-04T11:12:35.1441285Z + [[ distributed == *backward* ]] 2025-12-04T11:12:35.1441541Z + [[ distributed == *libtorch_agnostic_targetting* ]] 2025-12-04T11:12:35.1441791Z + [[ distributed == *xla* ]] 2025-12-04T11:12:35.1441992Z + [[ distributed == *vllm* ]] 2025-12-04T11:12:35.1442199Z + [[ distributed == *executorch* ]] 2025-12-04T11:12:35.1442429Z + [[ distributed == \j\i\t\_\l\e\g\a\c\y ]] 2025-12-04T11:12:35.1442944Z + [[ distributed == \q\u\a\n\t\i\z\a\t\i\o\n ]] 2025-12-04T11:12:35.1443214Z + [[ linux-noble-rocm-py3.12-mi300 == *libtorch* ]] 2025-12-04T11:12:35.1443466Z + [[ distributed == distributed ]] 2025-12-04T11:12:35.1443672Z + test_distributed 2025-12-04T11:12:35.1443866Z + echo 'Testing distributed python tests' 2025-12-04T11:12:35.1444100Z Testing distributed python tests 2025-12-04T11:12:35.1444383Z + python test/run_test.py --distributed-tests --shard 2 3 --verbose 2025-12-04T11:12:36.9250016Z Excluding distributed/rpc/test_faulty_agent on ROCm 2025-12-04T11:12:36.9250458Z Excluding distributed/rpc/test_tensorpipe_agent on ROCm 2025-12-04T11:12:36.9250844Z Excluding distributed/rpc/test_share_memory on ROCm 2025-12-04T11:12:36.9251199Z Excluding distributed/rpc/cuda/test_tensorpipe_agent on ROCm 2025-12-04T11:12:37.8489565Z Downloading https://ossci-metrics.s3.amazonaws.com/disabled-tests-condensed.json to /var/lib/jenkins/pytorch/test/.pytorch-disabled-tests.json 2025-12-04T11:12:38.1845385Z Ignoring disabled issues: [''] 2025-12-04T11:12:38.1894510Z Found test times from artifacts 2025-12-04T11:12:38.2067316Z Found test times from artifacts 2025-12-04T11:12:38.2072699Z Running all tests 2025-12-04T11:12:38.2118983Z Running parallel tests on 1 processes 2025-12-04T11:12:38.2119615Z Name: tests to run (est. 
time: 120.12min) 2025-12-04T11:12:38.2119950Z Serial tests (74): 2025-12-04T11:12:38.2120223Z distributed/test_dynamo_distributed 1/1 2025-12-04T11:12:38.2120535Z distributed/pipelining/test_backward 1/1 2025-12-04T11:12:38.2120827Z distributed/tensor/test_dtensor 1/1 2025-12-04T11:12:38.2121112Z distributed/tensor/test_redistribute 2/2 2025-12-04T11:12:38.2121433Z distributed/tensor/test_xla_integration 1/1 2025-12-04T11:12:38.2121762Z distributed/checkpoint/_experimental/test_types 1/1 2025-12-04T11:12:38.2122145Z distributed/tensor/experimental/test_register_sharding 1/1 2025-12-04T11:12:38.2122495Z distributed/tensor/test_tensor_ops 1/1 2025-12-04T11:12:38.2122809Z distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 2025-12-04T11:12:38.2123144Z distributed/tensor/debug/test_comm_mode_features 1/1 2025-12-04T11:12:38.2123463Z distributed/tensor/test_dtensor_ops 1/1 2025-12-04T11:12:38.2123740Z distributed/tensor/test_init 1/1 2025-12-04T11:12:38.2124022Z distributed/_composable/test_checkpoint 1/1 2025-12-04T11:12:38.2124321Z distributed/_tools/test_fsdp2_mem_tracker 1/1 2025-12-04T11:12:38.2124634Z distributed/checkpoint/e2e/test_fine_tuning 1/1 2025-12-04T11:12:38.2124941Z distributed/tensor/test_matrix_ops 1/1 2025-12-04T11:12:38.2125227Z distributed/pipelining/test_stage 1/1 2025-12-04T11:12:38.2126155Z distributed/tensor/parallel/test_tp_random_state 1/1 2025-12-04T11:12:38.2126475Z distributed/checkpoint/test_planner 1/1 2025-12-04T11:12:38.2126784Z distributed/checkpoint/test_dtensor_checkpoint 1/1 2025-12-04T11:12:38.2127098Z distributed/pipelining/test_schedule 1/1 2025-12-04T11:12:38.2127438Z distributed/_composable/fsdp/test_fully_shard_overlap 1/1 2025-12-04T11:12:38.2127761Z distributed/test_run 1/1 2025-12-04T11:12:38.2128014Z distributed/tensor/test_math_ops 1/1 2025-12-04T11:12:38.2128402Z distributed/test_functional_api 1/1 2025-12-04T11:12:38.2128733Z distributed/_composable/fsdp/test_fully_shard_compile 1/1 2025-12-04T11:12:38.2129068Z distributed/_composable/test_replicate 1/1 2025-12-04T11:12:38.2129363Z distributed/checkpoint/test_pg_transport 1/1 2025-12-04T11:12:38.2129723Z distributed/_composable/fsdp/test_fully_shard_mixed_precision 1/1 2025-12-04T11:12:38.2130069Z distributed/checkpoint/test_utils 1/1 2025-12-04T11:12:38.2130421Z distributed/checkpoint/_experimental/test_checkpoint_process 1/1 2025-12-04T11:12:38.2130752Z distributed/test_c10d_logger 1/1 2025-12-04T11:12:38.2130975Z distributed/_composable/test_replicate_training 1/1 2025-12-04T11:12:38.2131230Z distributed/optim/test_apply_optimizer_in_backward 1/1 2025-12-04T11:12:38.2131623Z distributed/fsdp/test_fsdp_uneven 1/1 2025-12-04T11:12:38.2131843Z distributed/tensor/test_op_strategy 1/1 2025-12-04T11:12:38.2132041Z distributed/fsdp/test_fsdp_grad_acc 1/1 2025-12-04T11:12:38.2132261Z distributed/checkpoint/test_state_dict_stager 1/1 2025-12-04T11:12:38.2132497Z distributed/fsdp/test_fsdp_freezing_weights 1/1 2025-12-04T11:12:38.2132748Z distributed/_composable/fsdp/test_fully_shard_init 1/1 2025-12-04T11:12:38.2132992Z distributed/fsdp/test_fsdp_exec_order 1/1 2025-12-04T11:12:38.2133206Z distributed/fsdp/test_fsdp_flatten_params 1/1 2025-12-04T11:12:38.2133417Z distributed/test_distributed_spawn 3/7 2025-12-04T11:12:38.2133627Z distributed/test_distributed_spawn 6/7 2025-12-04T11:12:38.2133830Z distributed/fsdp/test_fsdp_traversal 1/1 2025-12-04T11:12:38.2134037Z distributed/test_serialization 1/1 2025-12-04T11:12:38.2134267Z distributed/fsdp/test_fsdp_multiple_wrapping 1/1 2025-12-04T11:12:38.2134502Z 
distributed/fsdp/test_fsdp_ignored_modules 1/1 2025-12-04T11:12:38.2134735Z distributed/fsdp/test_checkpoint_wrapper 1/1 2025-12-04T11:12:38.2134949Z distributed/fsdp/test_fsdp_checkpoint 1/1 2025-12-04T11:12:38.2135155Z distributed/fsdp/test_fsdp_fine_tune 1/1 2025-12-04T11:12:38.2135377Z distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 2025-12-04T11:12:38.2135608Z distributed/fsdp/test_fsdp_comm_hooks 1/1 2025-12-04T11:12:38.2135818Z distributed/fsdp/test_fsdp_hybrid_shard 1/1 2025-12-04T11:12:38.2136024Z distributed/_shard/test_sharder 1/1 2025-12-04T11:12:38.2136263Z distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 2025-12-04T11:12:38.2136532Z distributed/_shard/sharding_plan/test_sharding_plan 1/1 2025-12-04T11:12:38.2136767Z distributed/fsdp/test_fsdp_comm 1/1 2025-12-04T11:12:38.2136966Z distributed/test_c10d_pypg 1/1 2025-12-04T11:12:38.2137155Z distributed/test_pg_wrapper 1/1 2025-12-04T11:12:38.2137351Z distributed/tensor/test_utils 1/1 2025-12-04T11:12:38.2137561Z distributed/fsdp/test_fsdp_unshard_params 1/1 2025-12-04T11:12:38.2137801Z distributed/checkpoint/test_state_dict_utils 1/1 2025-12-04T11:12:38.2138037Z distributed/_shard/sharded_tensor/ops/test_init 1/1 2025-12-04T11:12:38.2138335Z distributed/_shard/sharded_tensor/ops/test_embedding 1/1 2025-12-04T11:12:38.2138613Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 2025-12-04T11:12:38.2138904Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 2025-12-04T11:12:38.2139155Z distributed/fsdp/test_fsdp_core 1/3 2025-12-04T11:12:38.2139353Z distributed/test_c10d_spawn_gloo 1/1 2025-12-04T11:12:38.2139549Z distributed/test_c10d_spawn_ucc 1/1 2025-12-04T11:12:38.2139801Z distributed/test_c10d_gloo 1/2 2025-12-04T11:12:38.2140006Z distributed/fsdp/test_fsdp_mixed_precision 1/1 2025-12-04T11:12:38.2140215Z distributed/test_c10d_nccl 2/3 2025-12-04T11:12:38.2140402Z distributed/elastic/timer/api_test 1/1 2025-12-04T11:12:38.2140591Z Parallel tests (0): 2025-12-04T11:12:38.2140780Z Name: excluded (est. time: 0.0min) 2025-12-04T11:12:38.2140914Z Serial tests (0): 2025-12-04T11:12:38.2141035Z Parallel tests (0): 2025-12-04T11:12:38.2141251Z Running distributed/test_dynamo_distributed 1/1 ... [2025-12-04 11:12:38.212203][2286256.861384774] 2025-12-04T11:12:38.2141502Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:12:38.2142004Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_dynamo_distributed.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:12:38.212405] 2025-12-04T11:20:29.1645267Z 2025-12-04T11:20:29.1646314Z distributed/test_dynamo_distributed 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_dynamo_distributed_1.1_667e69f56c0d2ea5_.log 2025-12-04T11:20:29.1660367Z Running 62 items in this shard: test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_call_method_forward, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_ddp_optimizer_inductor_strides_dont_specialize, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_hf_bert_ddp_aot_eager, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_hf_bert_ddp_inductor, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_issue90375, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_symbol_splitting, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_direct, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_indirect, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_no_binding, test/distributed/test_dynamo_distributed.py::TestFakeDistributedSingleProc::test_unbacked_symbol_splitting_torture_multi, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_asymmetric_compilation, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_asymmetric_compilation_with_fx_cache, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_automatic_dynamic_scalar, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_automatic_dynamic_speculation_divergence, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_automatic_dynamic_tensor, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_dim_mismatch, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_graph_break_empty_graph_still_collective, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_missing_source, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_scalar_missing_source, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_compiler_collectives_type_mismatch, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_ddp_activation_checkpointing, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_ddp_baseline_aot_eager_multiprocess, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_ddp_optimizer_cudagraph, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_activation_checkpointing, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_aot_eager, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_inductor, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_setattr, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_unspecialized_forced_getattr_inline, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_fsdp_unspecialized_forced_getattr_no_inline, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_get_pg_attr, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_guard_collective, 
test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_aot_eager, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_aot_eager_static_graph, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_inductor, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_ddp_inductor_static_graph, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_fsdp, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_hf_bert_fsdp_activation_checkpointing, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_multiproc_autotune, test/distributed/test_dynamo_distributed.py::TestMultiProc::test_multiproc_autotune_dynamic_shapes, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_aot_autograd, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_async_subclass_no_specialize, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_compiled_flex_attention_full_model_ddp, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_compiled_flex_attention_local_ddp, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_custom_layer, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_ddp_baseline_aot_eager, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_ddp_baseline_inductor, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_empty_graph_inductor, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_dup_tensors_diff_source, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_dup_tensors_same_source, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_orig_params_assert, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_skip_guards, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_skip_register_attr_or_module, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_fsdp_staticmethod, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_ctx_manager, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor_layout_optimizations_inference, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor_layout_optimizations_training, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_graph_split_inductor_transpose, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_higher_order_op, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_ignored_parameters, test/distributed/test_dynamo_distributed.py::TestSingleProc::test_no_split 2025-12-04T11:20:29.1670466Z 2025-12-04T11:20:29.1670610Z Finished distributed/test_dynamo_distributed 1/1 ... 
[2025-12-04 11:20:29.164543][2286727.813723325], took 7.85min 2025-12-04T11:20:29.1671086Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:20:31.2085956Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:20:31.2086574Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:20:31.2086976Z Uploading artifacts took 0.00 seconds 2025-12-04T11:20:31.2087433Z Running distributed/pipelining/test_backward 1/1 ... [2025-12-04 11:20:31.208236][2286729.857415227] 2025-12-04T11:20:31.2088760Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:20:31.2089664Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/pipelining/test_backward.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:20:31.208496] 2025-12-04T11:20:37.5320812Z 2025-12-04T11:20:37.5322034Z distributed/pipelining/test_backward 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_backward_1.1_bb427c1284ca5bca_.log 2025-12-04T11:20:37.5324250Z Running 5 items in this shard: test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_input_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_weight_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_weight_grad_validation_cuda, test/distributed/pipelining/test_backward.py::StageBackwardTestsCUDA::test_stage_backward_weight_multiple_iters_cuda 2025-12-04T11:20:37.5325788Z 2025-12-04T11:20:37.5326047Z Finished distributed/pipelining/test_backward 1/1 ... [2025-12-04 11:20:37.531714][2286736.180893869], took 0.11min 2025-12-04T11:20:37.5327447Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:20:37.5345203Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:20:37.5346339Z Running distributed/tensor/test_dtensor 1/1 ... [2025-12-04 11:20:37.534498][2286736.18368196] 2025-12-04T11:20:37.5346628Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:20:37.5348056Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_dtensor.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:20:37.534676] 2025-12-04T11:23:34.3164353Z 2025-12-04T11:23:34.3168654Z distributed/tensor/test_dtensor 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_dtensor_1.1_23eb169f26f938b6_.log 2025-12-04T11:23:34.3185342Z Running 86 items in this shard: test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_async_output, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_constructor, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_new_empty_strided, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_properties, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_save_load, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_save_load_import, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_spec_hash, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_spec_read_only_after_set, test/distributed/tensor/test_dtensor.py::DTensorTest::test_dtensor_stride, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local_negative_dim, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local_then_to_local, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local_uneven_sharding, test/distributed/tensor/test_dtensor.py::DTensorTest::test_from_local_uneven_sharding_raise_error, test/distributed/tensor/test_dtensor.py::DTensorTest::test_full_tensor_grad_hint, test/distributed/tensor/test_dtensor.py::DTensorTest::test_full_tensor_sync, test/distributed/tensor/test_dtensor.py::DTensorTest::test_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorTest::test_modules_w_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorTest::test_shard_tensor, test/distributed/tensor/test_dtensor.py::DTensorTest::test_shard_tensor_2d, test/distributed/tensor/test_dtensor.py::DTensorTest::test_to_local, test/distributed/tensor/test_dtensor.py::DTensorTest::test_to_local_grad_hint, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_async_output, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_constructor, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_new_empty_strided, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_properties, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_save_load, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_save_load_import, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_spec_hash, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_spec_read_only_after_set, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_dtensor_stride, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_negative_dim, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_then_to_local, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_uneven_sharding, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_from_local_uneven_sharding_raise_error, 
test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_full_tensor_grad_hint, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_full_tensor_sync, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_modules_w_meta_dtensor, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_shard_tensor, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_shard_tensor_2d, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_to_local, test/distributed/tensor/test_dtensor.py::DTensorTestWithLocalTensor::test_to_local_grad_hint, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_as_strided_identity, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_auto_implicit_replication, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_default_value_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_device_mesh_nd, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_2d_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_api_device_mesh_context_manager, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_cond, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_device_mesh_device_conversion, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_dtensor_spec_local_shard_offset, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_from_local_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_implicit_replication, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_inplace_on_local_tensor_view, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_metadata_consistency_check, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_redistribute_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTest::test_vmap_embedding, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_as_strided_identity, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_auto_implicit_replication, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_default_value_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_device_mesh_nd, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_2d_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_api_device_mesh_context_manager, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_cond, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_device_mesh_device_conversion, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_dtensor_spec_local_shard_offset, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_from_local_sub_mesh, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_implicit_replication, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_inplace_on_local_tensor_view, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_metadata_consistency_check, test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_redistribute_sub_mesh, 
test/distributed/tensor/test_dtensor.py::DTensorMeshTestWithLocalTensor::test_vmap_embedding, test/distributed/tensor/test_dtensor.py::TestDTensorPlacementTypes::test_split_tensor_1D, test/distributed/tensor/test_dtensor.py::TestDTensorPlacementTypesWithLocalTensor::test_split_tensor_1D, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_default_shard_order, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_default_shard_order_generation, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_print, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_update, test/distributed/tensor/test_dtensor.py::TestDTensorSpec::test_dtensor_spec_with_invalid_shard_order, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_default_shard_order, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_dtensor_spec_default_shard_order_generation, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_dtensor_spec_print, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_dtensor_spec_update, test/distributed/tensor/test_dtensor.py::TestDTensorSpecWithLocalTensor::test_dtensor_spec_with_invalid_shard_order 2025-12-04T11:23:34.3197435Z 2025-12-04T11:23:34.3197563Z Finished distributed/tensor/test_dtensor 1/1 ... [2025-12-04 11:23:34.316616][2286912.9657978], took 2.95min 2025-12-04T11:23:34.3198017Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:23:34.3198466Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:23:34.3198715Z Running distributed/tensor/test_redistribute 2/2 ... [2025-12-04 11:23:34.318729][2286912.967913642] 2025-12-04T11:23:34.3198928Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:23:34.3199345Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_redistribute.py', '--shard-id=2', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:23:34.318904] 2025-12-04T11:24:38.1808962Z 2025-12-04T11:24:38.1810065Z distributed/tensor/test_redistribute 2/2 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_redistribute_2.2_f8b988b9ca5f7ec2_.log 2025-12-04T11:24:38.1821074Z Running 33 items in this shard: test/distributed/tensor/test_redistribute.py::RedistributeTest::test_one_chunk_mesh, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_partial_to_replicate_forward_backward_float32, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_partial_to_shard_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_local_partial_grad_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_local_partial_grad_float32, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_replicate_forward_backward_datatype_conversion, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_replicate_to_shard_forward_backward, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_shard_dim_alltoall_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_shard_dim_alltoall_float32, test/distributed/tensor/test_redistribute.py::RedistributeTest::test_shard_to_replicate_forward_backward_complex64, test/distributed/tensor/test_redistribute.py::MultiDimRedistributeTest::test_redistribute_shard_dim_multi_dim_mesh, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_generate_shard_orders, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_ordered_distribute_all_combination, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_ordered_redistribute_with_partial, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTest::test_shard_order_same_data_as_strided_shard, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_one_chunk_mesh, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_partial_to_replicate_forward_backward_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_partial_to_replicate_forward_backward_float32, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_partial_to_shard_complex64, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_redistribute_negative_shard_dim, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_redistribute_to_partial, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_redistribute_uneven_sharding, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_replicate_to_partial, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_replicate_to_replicate_forward_backward, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_replicate_to_replicate_forward_backward_datatype_conversion, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_shard_dim_alltoall_float32, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_shard_to_replicate_forward_backward_datatype_conversion, test/distributed/tensor/test_redistribute.py::RedistributeTestWithLocalTensor::test_shard_to_replicate_forward_backward_float32, 
test/distributed/tensor/test_redistribute.py::MultiDimRedistributeTestWithLocalTensor::test_multi_dim_mesh, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_generate_shard_orders, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_ordered_redistribute, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_ordered_redistribute_for_special_placement, test/distributed/tensor/test_redistribute.py::DistributeWithDeviceOrderTestWithLocalTensor::test_ordered_redistribute_with_partial 2025-12-04T11:24:38.1831224Z 2025-12-04T11:24:38.1831460Z Finished distributed/tensor/test_redistribute 2/2 ... [2025-12-04 11:24:38.180603][2286976.829783644], took 1.06min 2025-12-04T11:24:38.1832188Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:24:38.1832838Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:24:38.1834336Z Running distributed/tensor/test_xla_integration 1/1 ... [2025-12-04 11:24:38.183325][2286976.832509256] 2025-12-04T11:24:38.1834672Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:24:38.1836431Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_xla_integration.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:24:38.183496] 2025-12-04T11:24:40.3514762Z 2025-12-04T11:24:40.3515866Z distributed/tensor/test_xla_integration 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_xla_integration_1.1_4e7c95da93c4644a_.log 2025-12-04T11:24:40.3517577Z Running 3 items in this shard: test/distributed/tensor/test_xla_integration.py::DTensorXLAIntegrationTest::test_xla_distribute_tensor_1d_replicate, test/distributed/tensor/test_xla_integration.py::DTensorXLAIntegrationTest::test_xla_distribute_tensor_1d_shard, test/distributed/tensor/test_xla_integration.py::DTensorXLAIntegrationTest::test_xla_distribute_tensor_2d 2025-12-04T11:24:40.3518959Z 2025-12-04T11:24:40.3519247Z Finished distributed/tensor/test_xla_integration 1/1 ... [2025-12-04 11:24:40.351098][2286979.000277074], took 0.04min 2025-12-04T11:24:40.3520167Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:24:40.3539370Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:24:40.3542059Z Running distributed/checkpoint/_experimental/test_types 1/1 ... [2025-12-04 11:24:40.354068][2286979.003251882] 2025-12-04T11:24:40.3542407Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:24:40.3543760Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/_experimental/test_types.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:24:40.354240] 2025-12-04T11:24:42.5722294Z 2025-12-04T11:24:42.5723686Z distributed/checkpoint/_experimental/test_types 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint._experimental.test_types_1.1_5a37802355b2ddd8_.log 2025-12-04T11:24:42.5725475Z Running 3 items in this shard: test/distributed/checkpoint/_experimental/test_types.py::TestRankInfo::test_rank_info_default_initialization, test/distributed/checkpoint/_experimental/test_types.py::TestRankInfo::test_rank_info_initialization, test/distributed/checkpoint/_experimental/test_types.py::TestRankInfo::test_state_dict_type_alias 2025-12-04T11:24:42.5726710Z 2025-12-04T11:24:42.5727027Z Finished distributed/checkpoint/_experimental/test_types 1/1 ... [2025-12-04 11:24:42.571896][2286981.22107634], took 0.04min 2025-12-04T11:24:42.5727984Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:24:42.5747207Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:24:42.5749737Z Running distributed/tensor/experimental/test_register_sharding 1/1 ... [2025-12-04 11:24:42.574846][2286981.224030038] 2025-12-04T11:24:42.5750124Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:24:42.5751661Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/experimental/test_register_sharding.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:24:42.575031] 2025-12-04T11:24:58.3624288Z 2025-12-04T11:24:58.3625908Z distributed/tensor/experimental/test_register_sharding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.experimental.test_register_sharding_1.1_f0eac74a87d7a376_.log 2025-12-04T11:24:58.3628502Z Running 3 items in this shard: test/distributed/tensor/experimental/test_register_sharding.py::TestRegisterSharding::test_argmax, test/distributed/tensor/experimental/test_register_sharding.py::TestRegisterSharding::test_register_sharding_for_tensor_kwargs, test/distributed/tensor/experimental/test_register_sharding.py::TestRegisterSharding::test_softmax_fwd 2025-12-04T11:24:58.3629082Z 2025-12-04T11:24:58.3629250Z Finished distributed/tensor/experimental/test_register_sharding 1/1 ... [2025-12-04 11:24:58.362059][2286997.011239099], took 0.26min 2025-12-04T11:24:58.3629762Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:24:58.3647822Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:24:58.3650373Z Running distributed/tensor/test_tensor_ops 1/1 ... [2025-12-04 11:24:58.364903][2286997.014086689] 2025-12-04T11:24:58.3650834Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:24:58.3652111Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_tensor_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:24:58.365079] 2025-12-04T11:27:25.6719710Z 2025-12-04T11:27:25.6720724Z distributed/tensor/test_tensor_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_tensor_ops_1.1_f0e0b15364e85b24_.log 2025-12-04T11:27:25.6734062Z Running 62 items in this shard: test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_aten_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_clone, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_copy_, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_detach, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_dtensor_dtype_conversion, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_empty_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_equal, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_fill_inplace, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_fill_inplace_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_full_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_gather, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_index, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_index_put_scalar, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_index_put_tensor, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_inplace_op, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_new_empty_strided, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_new_full, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_ones_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_ones_like_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_op_out_variant, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_scatter, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_slice, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_split_on_partial, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_stack, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_stack_cache, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_unbind, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_where_type_promotion, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_zero_inplace, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_zeros_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTest::test_zeros_like_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_aten_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_clone, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_contiguous, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_copy_, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_detach, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_dtensor_dtype_conversion, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_empty_like, 
test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_equal, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_fill_inplace, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_fill_inplace_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_full_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_gather, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_index, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_index_put_scalar, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_index_put_tensor, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_inplace_op, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_new_empty_strided, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_new_full, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_ones_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_ones_like_partial_sum, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_op_out_variant, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_scatter, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_slice, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_split_on_partial, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_stack, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_stack_cache, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_unbind, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_where_type_promotion, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_zero_inplace, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_zeros_like, test/distributed/tensor/test_tensor_ops.py::DistTensorOpsTestWithLocalTensor::test_zeros_like_partial_sum 2025-12-04T11:27:25.6743259Z 2025-12-04T11:27:25.6743416Z Finished distributed/tensor/test_tensor_ops 1/1 ... [2025-12-04 11:27:25.671647][2287144.320827578], took 2.46min 2025-12-04T11:27:25.6743928Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:27:25.6744351Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:27:25.6744610Z Running distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 ... [2025-12-04 11:27:25.674302][2287144.323486351] 2025-12-04T11:27:25.6744823Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:27:25.6745843Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/fsdp/test_fsdp_dsd.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:27:25.674471] 2025-12-04T11:28:13.4636186Z 2025-12-04T11:28:13.4638546Z distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.fsdp.test_fsdp_dsd_1.1_5ae14876d5b52090_.log 2025-12-04T11:28:13.4642503Z Running 6 items in this shard: test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_1d_fsdp_cpu_offload_full_model_state_dict, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_1d_fsdp_get_model_state_dict, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_fsdp1_and_load_with_fsdp2, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_fsdp1_and_load_with_fsdp2_tp, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_fsdp2_tp_and_load_with_tp, test/distributed/checkpoint/fsdp/test_fsdp_dsd.py::TestFullyShardWithDistributedStateDict::test_save_with_tp_and_load_with_fsdp2_tp 2025-12-04T11:28:13.4645249Z 2025-12-04T11:28:13.4645588Z Finished distributed/checkpoint/fsdp/test_fsdp_dsd 1/1 ... [2025-12-04 11:28:13.463371][2287192.112550391], took 0.80min 2025-12-04T11:28:13.4646610Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:28:13.4662134Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:28:13.4664651Z Running distributed/tensor/debug/test_comm_mode_features 1/1 ... [2025-12-04 11:28:13.466352][2287192.115535908] 2025-12-04T11:28:13.4666241Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:28:13.4666872Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/debug/test_comm_mode_features.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:28:13.466536] 2025-12-04T11:28:45.0817415Z 2025-12-04T11:28:45.0818508Z distributed/tensor/debug/test_comm_mode_features 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.debug.test_comm_mode_features_1.1_cc58908746ac96e0_.log 2025-12-04T11:28:45.0822597Z Running 4 items in this shard: test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_MLPStacked_distributed_sharding_display, test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_MLP_distributed_sharding_display, test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_MLP_module_tracing, test/distributed/tensor/debug/test_comm_mode_features.py::TestCommModeFeatures::test_transformer_module_tracing 2025-12-04T11:28:45.0823559Z 2025-12-04T11:28:45.0823740Z Finished distributed/tensor/debug/test_comm_mode_features 1/1 ... [2025-12-04 11:28:45.081381][2287223.73056136], took 0.53min 2025-12-04T11:28:45.0824266Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:28:45.0841592Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:28:45.0844317Z Running distributed/tensor/test_dtensor_ops 1/1 ... 
[2025-12-04 11:28:45.084305][2287223.733489309] 2025-12-04T11:28:45.0844750Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:28:45.0846032Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_dtensor_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:28:45.084474] 2025-12-04T11:28:48.2508068Z 2025-12-04T11:28:48.2509253Z distributed/tensor/test_dtensor_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_dtensor_ops_1.1_e7e03ffb1fd8c0ba_.log 2025-12-04T11:28:48.2509792Z Running 0 items in this shard: 2025-12-04T11:28:48.2510354Z 2025-12-04T11:28:48.2510570Z Finished distributed/tensor/test_dtensor_ops 1/1 ... [2025-12-04 11:28:48.250524][2287226.899704378], took 0.05min 2025-12-04T11:28:48.2512437Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:28:48.2529756Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:28:48.2532570Z Running distributed/tensor/test_init 1/1 ... [2025-12-04 11:28:48.253172][2287226.902355781] 2025-12-04T11:28:48.2532836Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:28:48.2534718Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_init.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:28:48.253342] 2025-12-04T11:29:22.4704135Z 2025-12-04T11:29:22.4707798Z distributed/tensor/test_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_init_1.1_302246374c6efe1a_.log 2025-12-04T11:29:22.4710000Z Running 13 items in this shard: test/distributed/tensor/test_init.py::DTensorInitOpsTest::test_init_ops, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_empty, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_full, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_ones, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_zeros, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_zeros_full_mesh, test/distributed/tensor/test_init.py::DTensorConstructorTest::test_zeros_submesh, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_empty, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_full, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_ones, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_zeros, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_zeros_full_mesh, test/distributed/tensor/test_init.py::DTensorConstructorTestWithLocalTensor::test_zeros_submesh 2025-12-04T11:29:22.4712239Z 2025-12-04T11:29:22.4712364Z Finished distributed/tensor/test_init 1/1 ... 
[2025-12-04 11:29:22.470023][2287261.11920278], took 0.57min 2025-12-04T11:29:22.4714496Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:29:22.4725555Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:29:22.4728011Z Running distributed/_composable/test_checkpoint 1/1 ... [2025-12-04 11:29:22.472703][2287261.121886643] 2025-12-04T11:29:22.4728286Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:29:22.4730149Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/test_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:29:22.472887] 2025-12-04T11:29:28.0456796Z 2025-12-04T11:29:28.0458005Z distributed/_composable/test_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_checkpoint_1.1_1193b4dea4e22f77_.log 2025-12-04T11:29:28.0460984Z Running 6 items in this shard: test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_checkpoint_kwargs, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_clears_state_on_error_in_forward, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_multi_args, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_random_cpu, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_tensor_only_cpu, test/distributed/_composable/test_checkpoint.py::TestCheckpoint::test_tensor_only_gpu 2025-12-04T11:29:28.0462458Z 2025-12-04T11:29:28.0462724Z Finished distributed/_composable/test_checkpoint 1/1 ... [2025-12-04 11:29:28.045318][2287266.694497399], took 0.09min 2025-12-04T11:29:28.0463570Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:29:28.0480338Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:29:28.0482396Z Running distributed/_tools/test_fsdp2_mem_tracker 1/1 ... [2025-12-04 11:29:28.048152][2287266.697336159] 2025-12-04T11:29:28.0482706Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:29:28.0484287Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_tools/test_fsdp2_mem_tracker.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:29:28.048321] 2025-12-04T11:30:00.9584015Z 2025-12-04T11:30:00.9584872Z distributed/_tools/test_fsdp2_mem_tracker 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._tools.test_fsdp2_mem_tracker_1.1_ff74fe95d0881805_.log 2025-12-04T11:30:00.9587199Z Running 3 items in this shard: test/distributed/_tools/test_fsdp2_mem_tracker.py::TestTrackerFullyShard1DTrainingCore::test_tracker_multi_group_eager, test/distributed/_tools/test_fsdp2_mem_tracker.py::TestTrackerFullyShard1DTrainingCore::test_tracker_non_root_forward_backward, test/distributed/_tools/test_fsdp2_mem_tracker.py::TestTrackerFullyShard1DTrainingCompose::test_tracker_with_activation_checkpointing 2025-12-04T11:30:00.9588586Z 2025-12-04T11:30:00.9588906Z Finished distributed/_tools/test_fsdp2_mem_tracker 1/1 ... [2025-12-04 11:30:00.958028][2287299.607208383], took 0.55min 2025-12-04T11:30:00.9590389Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:30:00.9611599Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:30:00.9612643Z Running distributed/checkpoint/e2e/test_fine_tuning 1/1 ... [2025-12-04 11:30:00.961074][2287299.610258469] 2025-12-04T11:30:00.9612980Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:30:00.9613830Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/e2e/test_fine_tuning.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:30:00.961248] 2025-12-04T11:30:20.4067577Z 2025-12-04T11:30:20.4069288Z distributed/checkpoint/e2e/test_fine_tuning 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.e2e.test_fine_tuning_1.1_f4af570b33c9e31a_.log 2025-12-04T11:30:20.4070719Z Running 1 items in this shard: test/distributed/checkpoint/e2e/test_fine_tuning.py::TestFineTuning::test_fine_tuning 2025-12-04T11:30:20.4071246Z 2025-12-04T11:30:20.4071657Z Finished distributed/checkpoint/e2e/test_fine_tuning 1/1 ... [2025-12-04 11:30:20.406549][2287319.055729023], took 0.32min 2025-12-04T11:30:20.4074908Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:30:20.4093365Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:30:20.4097218Z Running distributed/tensor/test_matrix_ops 1/1 ... [2025-12-04 11:30:20.409444][2287319.058628583] 2025-12-04T11:30:20.4097558Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:30:20.4098892Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_matrix_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:30:20.409619] 2025-12-04T11:31:58.3401623Z 2025-12-04T11:31:58.3402735Z distributed/tensor/test_matrix_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_matrix_ops_1.1_8a1aea47df570e83_.log 2025-12-04T11:31:58.3412338Z Running 30 items in this shard: test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm_auto_redistribute, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_addmm_empty_operand, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_baddbmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_bmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_dtensor_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_grouped_mm_kwargs0, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_grouped_mm_kwargs1, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_matmul, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_scaled_dot_product_attention, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_scaled_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_t, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_t_partial, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTest::test_tensordot_shampoo, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_addmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_addmm_auto_redistribute, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_addmm_empty_operand, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_baddbmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_bmm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_dtensor_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_grouped_mm_kwargs0, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_grouped_mm_kwargs1, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_matmul, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_scaled_dot_product_attention, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_scaled_mm, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_t, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_t_partial, test/distributed/tensor/test_matrix_ops.py::DistMatrixOpsTestWithLocalTensor::test_tensordot_shampoo 2025-12-04T11:31:58.3420406Z 2025-12-04T11:31:58.3420598Z Finished distributed/tensor/test_matrix_ops 1/1 ... 
[2025-12-04 11:31:58.339776][2287416.988955446], took 1.63min 2025-12-04T11:31:58.3421206Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:31:58.3424880Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:31:58.3427346Z Running distributed/pipelining/test_stage 1/1 ... [2025-12-04 11:31:58.342625][2287416.991808566] 2025-12-04T11:31:58.3427570Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:31:58.3429358Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/pipelining/test_stage.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:31:58.342801] 2025-12-04T11:32:25.0473401Z 2025-12-04T11:32:25.0473905Z distributed/pipelining/test_stage 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_stage_1.1_f07eb832c6792751_.log 2025-12-04T11:32:25.0476002Z Running 8 items in this shard: test/distributed/pipelining/test_stage.py::StageTest::test_custom_dw_with_fb_schedule, test/distributed/pipelining/test_stage.py::StageTest::test_manual, test/distributed/pipelining/test_stage.py::StageTest::test_output_chunks_memory_usage, test/distributed/pipelining/test_stage.py::StageTest::test_tracer_ModelClass0, test/distributed/pipelining/test_stage.py::StageTest::test_tracer_ModelClass1, test/distributed/pipelining/test_stage.py::StageTest::test_tracer_kwargs_ModelClass0, test/distributed/pipelining/test_stage.py::StageNegativeTest::test_custom_dw_errors, test/distributed/pipelining/test_stage.py::StageNegativeTest::test_shape_prop_mismatch 2025-12-04T11:32:25.0477598Z 2025-12-04T11:32:25.0477818Z Finished distributed/pipelining/test_stage 1/1 ... [2025-12-04 11:32:25.046954][2287443.696134427], took 0.45min 2025-12-04T11:32:25.0478754Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:32:25.0495172Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:32:25.0497218Z Running distributed/tensor/parallel/test_tp_random_state 1/1 ... [2025-12-04 11:32:25.049630][2287443.69881402] 2025-12-04T11:32:25.0497525Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:32:25.0499405Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/parallel/test_tp_random_state.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:32:25.049798] 2025-12-04T11:32:33.3270671Z 2025-12-04T11:32:33.3271695Z distributed/tensor/parallel/test_tp_random_state 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.parallel.test_tp_random_state_1.1_bdd7d70d1ebe3f35_.log 2025-12-04T11:32:33.3275696Z Running 1 items in this shard: test/distributed/tensor/parallel/test_tp_random_state.py::TensorParallelRandomStateTests::test_model_init 2025-12-04T11:32:33.3276361Z 2025-12-04T11:32:33.3276812Z Finished distributed/tensor/parallel/test_tp_random_state 1/1 ... 
[2025-12-04 11:32:33.326692][2287451.975871114], took 0.14min 2025-12-04T11:32:33.3279402Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:32:33.3297694Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:32:33.3301535Z Running distributed/checkpoint/test_planner 1/1 ... [2025-12-04 11:32:33.329859][2287451.979043479] 2025-12-04T11:32:33.3301821Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:32:33.3302643Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_planner.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:32:33.330028] 2025-12-04T11:32:35.5478861Z 2025-12-04T11:32:35.5479406Z distributed/checkpoint/test_planner 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_planner_1.1_844b415c886f474f_.log 2025-12-04T11:32:35.5484092Z Running 17 items in this shard: test/distributed/checkpoint/test_planner.py::TestSavePlan::test_dedup_plans, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_finish_plan_with_caching, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_global_plan, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_global_plan_with_caching, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_load_with_resharding, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_load_with_world_size_diff_by_one, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_local_load_plan, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_local_plan, test/distributed/checkpoint/test_planner.py::TestSavePlan::test_local_plan_with_caching, test/distributed/checkpoint/test_planner.py::TestPlannerHelpers::test_compare_save_plans, test/distributed/checkpoint/test_planner.py::TestPlannerHelpers::test_create_read_item_from_chunks, test/distributed/checkpoint/test_planner.py::TestPlannerHelpers::test_merge_delta_local_plans, test/distributed/checkpoint/test_planner.py::TestValidateGlobalPlan::test_detect_overlapping_chunks, test/distributed/checkpoint/test_planner.py::TestValidateGlobalPlan::test_non_overlapping_chunks, test/distributed/checkpoint/test_planner.py::TestLoadPlanner::test_load_different_sizes_throws, test/distributed/checkpoint/test_planner.py::TestLoadPlanner::test_strict, test/distributed/checkpoint/test_planner.py::TestLoadPlanner::test_version_key_in_planner_data 2025-12-04T11:32:35.5487866Z 2025-12-04T11:32:35.5488270Z Finished distributed/checkpoint/test_planner 1/1 ... [2025-12-04 11:32:35.547513][2287454.196693529], took 0.04min 2025-12-04T11:32:35.5489022Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:32:35.5503817Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:32:35.5506667Z Running distributed/checkpoint/test_dtensor_checkpoint 1/1 ... 
[2025-12-04 11:32:35.550528][2287454.199711956] 2025-12-04T11:32:35.5506955Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:32:35.5508452Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_dtensor_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:32:35.550704] 2025-12-04T11:32:42.8758703Z 2025-12-04T11:32:42.8759850Z distributed/checkpoint/test_dtensor_checkpoint 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_dtensor_checkpoint_1.1_e24346b9f1951dfb_.log 2025-12-04T11:32:42.8760944Z Running 1 items in this shard: test/distributed/checkpoint/test_dtensor_checkpoint.py::DTensorPlanner::test_distributed_tensor_planner 2025-12-04T11:32:42.8761392Z 2025-12-04T11:32:42.8761708Z Finished distributed/checkpoint/test_dtensor_checkpoint 1/1 ... [2025-12-04 11:32:42.875515][2287461.524695434], took 0.12min 2025-12-04T11:32:42.8766176Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:32:42.8784366Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:32:42.8786895Z Running distributed/pipelining/test_schedule 1/1 ... [2025-12-04 11:32:42.878555][2287461.52773934] 2025-12-04T11:32:42.8787239Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:32:42.8788591Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/pipelining/test_schedule.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:32:42.878731] 2025-12-04T11:33:08.3326422Z 2025-12-04T11:33:08.3327089Z distributed/pipelining/test_schedule 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.pipelining.test_schedule_1.1_ce7bd12d8f7e2c87_.log 2025-12-04T11:33:08.3334834Z Running 43 items in this shard: test/distributed/pipelining/test_schedule.py::ScheduleTest::test_get_schedule_class, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass0, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass1, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass2, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass3, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_eval_then_train_ScheduleClass4, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass0, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass1, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass2, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass3, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_schedule_with_single_stage_ScheduleClass4, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_zero_bubble_schedule_errors_with_compile_ScheduleClass0, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_zero_bubble_schedule_errors_with_compile_ScheduleClass1, test/distributed/pipelining/test_schedule.py::ScheduleTest::test_zero_bubble_schedule_errors_with_compile_ScheduleClass2, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_ScheduleClass0, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_ScheduleClass1, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_flex_and_zero_bubble_ScheduleClass0, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_flex_and_zero_bubble_ScheduleClass1, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_for_v_schedules_ScheduleClass0, test/distributed/pipelining/test_schedule.py::TestSchedulePlan::test_pipeline_order_for_v_schedules_ScheduleClass1, test/distributed/pipelining/test_schedule.py::TestScheduleCsv::test_csv_compare_ScheduleClass0_csv_name_dualpipev_4rank_10mb, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref1, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref2, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref3, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref4, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref5, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref6, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_action_parse_action_str_and_ref7, 
test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_csv_csv_name_zb1p_2rank_2stagep, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_grad_with_split_b_w, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_grad_with_v_schedule, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_merge_bw_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_reduce_grad_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_reduce_grad_test_info1, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_send_recv_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_send_recv_test_info1, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_unshard_reshard_test_info0, test/distributed/pipelining/test_schedule.py::TestScheduleLowering::test_unshard_reshard_test_info1, test/distributed/pipelining/test_schedule.py::TestValidateSchedule::test_invalid_schedule_missing_action, test/distributed/pipelining/test_schedule.py::TestValidateSchedule::test_invalid_schedule_missing_rank, test/distributed/pipelining/test_schedule.py::TestValidateSchedule::test_valid_schedule, test/distributed/pipelining/test_schedule.py::ScheduleUtilTests::test_generate_stage_to_rank_mapping 2025-12-04T11:33:08.3341416Z 2025-12-04T11:33:08.3341556Z Finished distributed/pipelining/test_schedule 1/1 ... [2025-12-04 11:33:08.332246][2287486.981425943], took 0.42min 2025-12-04T11:33:08.3342000Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:33:08.3350004Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:33:08.3353029Z Running distributed/_composable/fsdp/test_fully_shard_overlap 1/1 ... [2025-12-04 11:33:08.335190][2287486.984374561] 2025-12-04T11:33:08.3353489Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:33:08.3355131Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_overlap.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:33:08.335374] 2025-12-04T11:33:18.6153574Z 2025-12-04T11:33:18.6155010Z distributed/_composable/fsdp/test_fully_shard_overlap 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_overlap_1.1_f0dbe397233484d2_.log 2025-12-04T11:33:18.6157063Z Running 2 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_overlap.py::TestFullyShardOverlap::test_fully_shard_post_optim_event_overlap, test/distributed/_composable/fsdp/test_fully_shard_overlap.py::TestFullyShardOverlap::test_fully_shard_training_overlap 2025-12-04T11:33:18.6158452Z 2025-12-04T11:33:18.6158949Z Finished distributed/_composable/fsdp/test_fully_shard_overlap 1/1 ... 
[2025-12-04 11:33:18.615018][2287497.264198708], took 0.17min 2025-12-04T11:33:18.6160288Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:33:18.6178884Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:33:18.6181216Z Running distributed/test_run 1/1 ... [2025-12-04 11:33:18.618006][2287497.267189576] 2025-12-04T11:33:18.6181554Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:33:18.6183125Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_run.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:33:18.618191] 2025-12-04T11:33:20.8359751Z 2025-12-04T11:33:20.8360501Z distributed/test_run 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_run_1.1_21fea8d12c472afb_.log 2025-12-04T11:33:20.8361679Z Running 4 items in this shard: test/distributed/test_run.py::RunTest::test_config_from_args_signals_to_handle, test/distributed/test_run.py::RunTest::test_launch_agent_sets_environment_variable, test/distributed/test_run.py::RunTest::test_signals_to_handle_custom, test/distributed/test_run.py::RunTest::test_signals_to_handle_default 2025-12-04T11:33:20.8362470Z 2025-12-04T11:33:20.8362655Z Finished distributed/test_run 1/1 ... [2025-12-04 11:33:20.835746][2287499.484926406], took 0.04min 2025-12-04T11:33:20.8368432Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:33:20.8386851Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:33:20.8390716Z Running distributed/tensor/test_math_ops 1/1 ... [2025-12-04 11:33:20.838830][2287499.488014322] 2025-12-04T11:33:20.8391190Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:33:20.8391999Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_math_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:33:20.839012] 2025-12-04T11:35:49.3256312Z 2025-12-04T11:35:49.3257121Z distributed/tensor/test_math_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_math_ops_1.1_85a1a8506d37fc70_.log 2025-12-04T11:35:49.3269374Z Running 54 items in this shard: test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_conj_complex_dtensor, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_cumsum, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_add_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_norm_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_foreach_norm_partial, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_histc, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_layer_norm_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_layer_norm_bwd_req_grad, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_layer_norm_fwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_linalg_eigh, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_linear_op_reductions, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_logsumexp, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_matching_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_mean, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_nll_loss_and_cross_entropy, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_rotary_embedding_complex_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_shard0_svd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_shard_math_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_softmax_fwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_softmax_with_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_std, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_topk, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_upsampling, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_vector_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTest::test_vector_norm_partial, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_conj_complex_dtensor, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_cumsum, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_add_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_norm_different_mesh, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_foreach_norm_partial, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_histc, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_layer_norm_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_layer_norm_bwd_req_grad, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_layer_norm_fwd, 
test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_linalg_eigh, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_linear_op_reductions, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_logsumexp, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_matching_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_mean, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_nll_loss_and_cross_entropy, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_partial_reduction_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_rotary_embedding_complex_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_shard0_svd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_shard_math_ops, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_softmax_fwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_softmax_with_bwd, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_std, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_topk, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_upsampling, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_vector_norm, test/distributed/tensor/test_math_ops.py::DistMathOpsTestWithLocalTensor::test_vector_norm_partial 2025-12-04T11:35:49.3277898Z 2025-12-04T11:35:49.3278060Z Finished distributed/tensor/test_math_ops 1/1 ... [2025-12-04 11:35:49.325416][2287647.974595905], took 2.47min 2025-12-04T11:35:49.3278616Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:35:49.3282917Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:35:49.3285632Z Running distributed/test_functional_api 1/1 ... [2025-12-04 11:35:49.328479][2287647.977662191] 2025-12-04T11:35:49.3285837Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:35:49.3287440Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_functional_api.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:35:49.328654] 2025-12-04T11:37:46.3629929Z 2025-12-04T11:37:46.3633899Z distributed/test_functional_api 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_functional_api_1.1_06d3bb52f6c4d2e0_.log 2025-12-04T11:37:46.3638902Z Running 11 items in this shard: test/distributed/test_functional_api.py::TestMetaCollectives::test_all_reduce, test/distributed/test_functional_api.py::TestMakeFx::test_all_reduce_tracing, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_gather_into_tensor_coalesced_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_1d_input_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_all_to_all_single_split_sizes_none_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_with_dce_code_cuda, test/distributed/test_functional_api.py::TestCollectivesWithDistributedBackendCUDA::test_tracing_with_fakepg_cuda, test/distributed/test_functional_api.py::TestDistributedBackendCollectivesWithWorldSize4CUDA::test_permute_tensor_with_sub_group_cuda, test/distributed/test_functional_api.py::TestFunctionalAutogradWithDistributedBackendCUDA::test_all_to_all_single_cuda 2025-12-04T11:37:46.3642530Z 2025-12-04T11:37:46.3643272Z Finished distributed/test_functional_api 1/1 ... [2025-12-04 11:37:46.362570][2287765.011750863], took 1.95min 2025-12-04T11:37:46.3643962Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:37:46.3655624Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:37:46.3656081Z Running distributed/_composable/fsdp/test_fully_shard_compile 1/1 ... [2025-12-04 11:37:46.365471][2287765.014655372] 2025-12-04T11:37:46.3656376Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:37:46.3658082Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_compile.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:37:46.365644] 2025-12-04T11:42:23.0934882Z 2025-12-04T11:42:23.0935894Z distributed/_composable/fsdp/test_fully_shard_compile 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_compile_1.1_5c36632b5155c6d2_.log 2025-12-04T11:42:23.0943049Z Running 18 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompileCompute::test_disable_compiling_hooks, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_compiled_autograd_ctx, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_dynamo_recompiles_on_fsdp_layers, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_dynamo_trace_use_training_state, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_nested_fully_shard_backend_aot_eager, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_nested_fully_shard_backend_aot_eager_decomp_partition, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_nested_fully_shard_backend_inductor_fullgraph_False, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_nested_fully_shard_backend_inductor_fullgraph_True, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_nested_fully_shard_backend_inductor_fullgraph_True_graph_partition, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_aot_eager, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_aot_eager_decomp_partition, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_simple_mlp_fullgraph_backend_inductor, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_trace_fsdp_copy_, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_aot_eager, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_aot_eager_decomp_partition, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_inductor_fullgraph_False, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_inductor_fullgraph_True, test/distributed/_composable/fsdp/test_fully_shard_compile.py::TestFullyShardCompile::test_transformer_backend_inductor_fullgraph_True_graph_partition 2025-12-04T11:42:23.0950230Z 2025-12-04T11:42:23.0950480Z Finished distributed/_composable/fsdp/test_fully_shard_compile 1/1 ... [2025-12-04 11:42:23.093268][2288041.742449257], took 4.61min 2025-12-04T11:42:23.0951117Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:42:23.0958508Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:42:23.0958816Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:42:23.0959060Z Uploading artifacts took 0.00 seconds 2025-12-04T11:42:23.0961417Z Running distributed/_composable/test_replicate 1/1 ... 
[2025-12-04 11:42:23.096002][2288041.745186339] 2025-12-04T11:42:23.0961978Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:42:23.0963179Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/test_replicate.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:42:23.096175] 2025-12-04T11:43:20.9981534Z 2025-12-04T11:43:20.9982740Z distributed/_composable/test_replicate 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_replicate_1.1_aa1eb9fb0e3bb004_.log 2025-12-04T11:43:20.9989265Z Running 17 items in this shard: test/distributed/_composable/test_replicate.py::ReplicateStateDictTest::test_replicate_non_root_multiple_save_load, test/distributed/_composable/test_replicate.py::ReplicateStateDictTest::test_replicate_single_module_save_load, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_device_id, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_ignore_module, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_move_args_kwargs_to_device, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_multi_module, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_single_module, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_with_kwargs, test/distributed/_composable/test_replicate.py::ReplicateTest::test_replicate_wrong_device_id_type, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_device_id, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_fully_shard_init, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_ignore_module, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_move_args_kwargs_to_device, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_multi_module, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_single_module, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_with_kwargs, test/distributed/_composable/test_replicate.py::ReplicateFullyShardInit::test_replicate_wrong_device_id_type 2025-12-04T11:43:20.9994979Z 2025-12-04T11:43:20.9995254Z Finished distributed/_composable/test_replicate 1/1 ... [2025-12-04 11:43:20.997677][2288099.646856593], took 0.97min 2025-12-04T11:43:20.9996011Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:43:21.0010281Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:43:21.0011684Z Running distributed/checkpoint/test_pg_transport 1/1 ... [2025-12-04 11:43:21.001007][2288099.650191034] 2025-12-04T11:43:21.0011942Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:43:21.0013849Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_pg_transport.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:43:21.001184] 2025-12-04T11:43:31.1314417Z 2025-12-04T11:43:31.1316577Z distributed/checkpoint/test_pg_transport 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_pg_transport_1.1_a804652e5136a4d7_.log 2025-12-04T11:43:31.1323297Z Running 21 items in this shard: test/distributed/checkpoint/test_pg_transport.py::PgTransportCPU::test_pg_transport, test/distributed/checkpoint/test_pg_transport.py::PgTransportCPU::test_pg_transport_with_mixed_content, test/distributed/checkpoint/test_pg_transport.py::PgTransportCPU::test_pg_transport_with_sharded_tensor, test/distributed/checkpoint/test_pg_transport.py::PgTransportGPU::test_pg_transport, test/distributed/checkpoint/test_pg_transport.py::PgTransportGPU::test_pg_transport_with_mixed_content, test/distributed/checkpoint/test_pg_transport.py::PgTransportGPU::test_pg_transport_with_sharded_tensor, test/distributed/checkpoint/test_pg_transport.py::TestCastTensor::test_cast_tensor_different_dtypes, test/distributed/checkpoint/test_pg_transport.py::TestCastTensor::test_cast_tensor_with_offset, test/distributed/checkpoint/test_pg_transport.py::TestCastTensor::test_cast_tensor_with_stride, test/distributed/checkpoint/test_pg_transport.py::TestPrepareTensor::test_prepare_tensor_basic, test/distributed/checkpoint/test_pg_transport.py::TestPrepareTensor::test_prepare_tensor_different_shapes, test/distributed/checkpoint/test_pg_transport.py::TestPrepareTensor::test_prepare_tensor_with_stride, test/distributed/checkpoint/test_pg_transport.py::TestPrepareStateDict::test_prepare_state_dict_basic, test/distributed/checkpoint/test_pg_transport.py::TestPrepareStateDict::test_prepare_state_dict_nested, test/distributed/checkpoint/test_pg_transport.py::TestPrepareStateDict::test_prepare_state_dict_with_non_tensor_values, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportMocked::test_recv_checkpoint_basic, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportMocked::test_recv_checkpoint_with_state_dict_callback, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportMocked::test_send_checkpoint_basic, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportMocked::test_send_checkpoint_empty_state_dict, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportMocked::test_send_checkpoint_with_non_tensor_values, test/distributed/checkpoint/test_pg_transport.py::TestPGTransportEdgeCases::test_send_checkpoint_with_cpu_tensors 2025-12-04T11:43:31.1328794Z 2025-12-04T11:43:31.1329024Z Finished distributed/checkpoint/test_pg_transport 1/1 ... [2025-12-04 11:43:31.131177][2288109.780356926], took 0.17min 2025-12-04T11:43:31.1329722Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:43:31.1343761Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:43:31.1346857Z Running distributed/_composable/fsdp/test_fully_shard_mixed_precision 1/1 ... [2025-12-04 11:43:31.134501][2288109.783685228] 2025-12-04T11:43:31.1347454Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:43:31.1348615Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_mixed_precision.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:43:31.134681] 2025-12-04T11:44:23.9311851Z 2025-12-04T11:44:23.9317186Z distributed/_composable/fsdp/test_fully_shard_mixed_precision 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_mixed_precision_1.1_dab913226be0626b_.log 2025-12-04T11:44:23.9323792Z Running 9 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionTraining::test_compute_dtype, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionTraining::test_grad_acc_with_reduce_dtype, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionTraining::test_reduce_dtype, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_clamp_reduce_dtype, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_dataclass_input, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_float16_on_one_submodule, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_norm_modules_bf16, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_norm_modules_fp16, test/distributed/_composable/fsdp/test_fully_shard_mixed_precision.py::TestFullyShardMixedPrecisionCasts::test_submodules_with_external_inputs 2025-12-04T11:44:23.9327992Z 2025-12-04T11:44:23.9328455Z Finished distributed/_composable/fsdp/test_fully_shard_mixed_precision 1/1 ... [2025-12-04 11:44:23.930730][2288162.579909763], took 0.88min 2025-12-04T11:44:23.9329539Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:44:23.9339269Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:44:23.9341658Z Running distributed/checkpoint/test_utils 1/1 ... [2025-12-04 11:44:23.934013][2288162.583197205] 2025-12-04T11:44:23.9341939Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:44:23.9344907Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:44:23.934196] 2025-12-04T11:44:52.2893333Z 2025-12-04T11:44:52.2894248Z distributed/checkpoint/test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_utils_1.1_8e3cc81d9cc30468_.log 2025-12-04T11:44:52.2899079Z Running 16 items in this shard: test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_dcp_logger, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_flat_data, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_index_hint_ignored_on_equals, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_index_hint_ignored_on_hash, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_init_convert_offset, test/distributed/checkpoint/test_utils.py::TestMedatadaIndex::test_sharded_tensor_lookup, test/distributed/checkpoint/test_utils.py::TestReaderView::testAllRead, test/distributed/checkpoint/test_utils.py::TestReaderView::testLongRead, test/distributed/checkpoint/test_utils.py::TestReaderView::testLongReadinto, test/distributed/checkpoint/test_utils.py::TestReaderView::testShortRead, test/distributed/checkpoint/test_utils.py::TestReaderView::testShortReadinto, test/distributed/checkpoint/test_utils.py::TestDistWrapper::test_barrier, test/distributed/checkpoint/test_utils.py::TestDistWrapper::test_broadcast_object_global_local_mismatch, test/distributed/checkpoint/test_utils.py::TestDistWrapper::test_broadcast_object_with_nonzero_coordinator, test/distributed/checkpoint/test_utils.py::TestDistWrapper::test_gather_object, test/distributed/checkpoint/test_utils.py::TestDistWrapper::test_scatter_object 2025-12-04T11:44:52.2902680Z 2025-12-04T11:44:52.2902906Z Finished distributed/checkpoint/test_utils 1/1 ... [2025-12-04 11:44:52.289121][2288190.938301014], took 0.47min 2025-12-04T11:44:52.2905971Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:44:52.2922645Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:44:52.2925376Z Running distributed/checkpoint/_experimental/test_checkpoint_process 1/1 ... [2025-12-04 11:44:52.292357][2288190.941541017] 2025-12-04T11:44:52.2925689Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:44:52.2926281Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/_experimental/test_checkpoint_process.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:44:52.292526] 2025-12-04T11:45:11.4341542Z 2025-12-04T11:45:11.4342546Z distributed/checkpoint/_experimental/test_checkpoint_process 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint._experimental.test_checkpoint_process_1.1_f38997afd754e436_.log 2025-12-04T11:45:11.4347595Z Running 15 items in this shard: test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestRequestTypes::test_request_type_enum, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestRequestTypes::test_worker_request, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestRequestTypes::test_worker_response, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcessConfig::test_custom_options, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcessConfig::test_default_options, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_checkpoint_process_initialization, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_checkpoint_write_future_state_dict, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_checkpoint_write_sync_state_dict, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_checkpoint_write_with_kwargs, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_communication_error_handling, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_forced_termination, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_graceful_termination, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_shared_memory_tensor_ipc, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_subprocess_initialization_failure, test/distributed/checkpoint/_experimental/test_checkpoint_process.py::TestCheckpointProcess::test_subprocess_initialization_timeout 2025-12-04T11:45:11.4352738Z 2025-12-04T11:45:11.4353002Z Finished distributed/checkpoint/_experimental/test_checkpoint_process 1/1 ... [2025-12-04 11:45:11.433900][2288210.083081142], took 0.32min 2025-12-04T11:45:11.4353731Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:45:11.4366279Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:45:11.4368672Z Running distributed/test_c10d_logger 1/1 ... [2025-12-04 11:45:11.436789][2288210.085972641] 2025-12-04T11:45:11.4368892Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:45:11.4370789Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_logger.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:45:11.436969] 2025-12-04T11:45:20.3148493Z 2025-12-04T11:45:20.3149363Z distributed/test_c10d_logger 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_logger_1.1_564604c60adf8385_.log 2025-12-04T11:45:20.3151338Z Running 2 items in this shard: test/distributed/test_c10d_logger.py::C10dErrorLoggerTest::test_exception_logger, test/distributed/test_c10d_logger.py::C10dErrorLoggerTest::test_get_or_create_logger 2025-12-04T11:45:20.3151985Z 2025-12-04T11:45:20.3152274Z Finished distributed/test_c10d_logger 1/1 ... [2025-12-04 11:45:20.314497][2288218.963677815], took 0.15min 2025-12-04T11:45:20.3156973Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:45:20.3174334Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:45:20.3178687Z Running distributed/_composable/test_replicate_training 1/1 ... [2025-12-04 11:45:20.317532][2288218.966715952] 2025-12-04T11:45:20.3179079Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:45:20.3179832Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/test_replicate_training.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:45:20.317711] 2025-12-04T11:47:31.1177505Z 2025-12-04T11:47:31.1178718Z distributed/_composable/test_replicate_training 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.test_replicate_training_1.1_f26ae4680a21c31a_.log 2025-12-04T11:47:31.1186537Z Running 17 items in this shard: test/distributed/_composable/test_replicate_training.py::TestReplicateForwardInputs::test_root_move_forward_input_to_device, test/distributed/_composable/test_replicate_training.py::TestReplicateRegisteredParams::test_param_registration_after_backward, test/distributed/_composable/test_replicate_training.py::TestReplicateRegisteredParams::test_param_registration_after_forward, test/distributed/_composable/test_replicate_training.py::TestReplicateCastAfterInit::test_to_float64_after_init, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_explicit_prefetching, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_multi_forward_module, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_non_root_forward_backward, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_post_optim_event, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_train_parity_multi_group_cpu_offload_eager, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_train_parity_multi_groups, test/distributed/_composable/test_replicate_training.py::TestReplicate1DTrainingCore::test_train_parity_single_group, test/distributed/_composable/test_replicate_training.py::TestReplicateTrainingCompose::test_train_parity_with_activation_checkpointing, test/distributed/_composable/test_replicate_training.py::TestReplicateSharedParams::test_train_parity_with_shared_params, test/distributed/_composable/test_replicate_training.py::TestReplicateGradientAccumulation::test_1f1b_microbatching, 
test/distributed/_composable/test_replicate_training.py::TestReplicateGradientAccumulation::test_gradient_accumulation, test/distributed/_composable/test_replicate_training.py::TestReplicateCustomForwardMethod::test_register_fsdp_forward_method, test/distributed/_composable/test_replicate_training.py::TestReplicateTPTraining::test_replicate_tp 2025-12-04T11:47:31.1191380Z 2025-12-04T11:47:31.1191617Z Finished distributed/_composable/test_replicate_training 1/1 ... [2025-12-04 11:47:31.117438][2288349.766617311], took 2.18min 2025-12-04T11:47:31.1192326Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:47:31.1207621Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:47:31.1211631Z Running distributed/optim/test_apply_optimizer_in_backward 1/1 ... [2025-12-04 11:47:31.120941][2288349.770125278] 2025-12-04T11:47:31.1212734Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:47:31.1213649Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/optim/test_apply_optimizer_in_backward.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:47:31.121115] 2025-12-04T11:47:32.3532358Z 2025-12-04T11:47:32.3533260Z distributed/optim/test_apply_optimizer_in_backward 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.optim.test_apply_optimizer_in_backward_1.1_2e4a72c6e91ee59d_.log 2025-12-04T11:47:32.3533828Z 2025-12-04T11:47:32.3534102Z Finished distributed/optim/test_apply_optimizer_in_backward 1/1 ... [2025-12-04 11:47:32.352920][2288351.002102835], took 0.02min 2025-12-04T11:47:32.3541141Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:47:32.3558799Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:47:32.3560667Z Running distributed/fsdp/test_fsdp_uneven 1/1 ... [2025-12-04 11:47:32.355914][2288351.005097361] 2025-12-04T11:47:32.3560971Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:47:32.3562798Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_uneven.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:47:32.356049] 2025-12-04T11:48:05.4641859Z 2025-12-04T11:48:05.4642860Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_uneven 1/1 (test/test-reports/distributed.fsdp.test_fsdp_uneven_1.1_73d54334789787ed_.log) 2025-12-04T11:48:05.4644249Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-0cec19ea9b3dfbff.xml 2025-12-04T11:48:05.4645172Z ============================= test session starts ============================== 2025-12-04T11:48:05.4645802Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:48:05.4646359Z cachedir: .pytest_cache 2025-12-04T11:48:05.4647004Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:48:05.4647678Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:48:05.4648001Z configfile: pytest.ini 2025-12-04T11:48:05.4648712Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:48:05.4649385Z collecting ... collected 1 item 2025-12-04T11:48:05.4649765Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T11:48:05.4650350Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda 2025-12-04T11:48:05.4650545Z 2025-12-04T11:48:05.4650839Z distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda I1204 11:47:34.036000 335417 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 335486 2025-12-04T11:48:05.4651320Z I1204 11:47:34.037000 335417 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 335487 2025-12-04T11:48:05.4651762Z I1204 11:47:34.037000 335417 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 335488 2025-12-04T11:48:05.4652107Z I1204 11:47:34.038000 335417 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 335489 2025-12-04T11:48:05.4652446Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4653356Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4653857Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4654382Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4654889Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4655346Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4655792Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T11:48:05.4656262Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4656730Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4657307Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4657772Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4658279Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4658735Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4659207Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4659889Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4660507Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4660865Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4661434Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4661919Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4662287Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4662703Z [rank1]:E1204 11:47:40.189000 335487 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:48:05.4662946Z dist init r=1, world=4 2025-12-04T11:48:05.4663191Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4663528Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4664020Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4664498Z 
[rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4664973Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4665421Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4665859Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4666361Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4666822Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4667282Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4667750Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4668236Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4668690Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4669153Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4669792Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2243952640 and is now 3240099840. 
2025-12-04T11:48:05.4670390Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4670737Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4671299Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4671778Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4672187Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4672601Z [rank3]:E1204 11:47:40.199000 335489 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:48:05.4672843Z dist init r=3, world=4 2025-12-04T11:48:05.4673046Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4673383Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4673867Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4674346Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4674823Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4675268Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4675738Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4676202Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4676663Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4677130Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4677590Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4678040Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4678530Z 
[rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4678996Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4679630Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3290431488. 2025-12-04T11:48:05.4680227Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4680575Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4681134Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4681652Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4682016Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4682428Z [rank2]:E1204 11:47:40.202000 335488 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:48:05.4682669Z dist init r=2, world=4 2025-12-04T11:48:05.4682869Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4683203Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4683691Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4684168Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4684648Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4685146Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4685583Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4686047Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4686508Z [rank0]:E1204 11:47:40.265000 335486 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4686970Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4687432Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4687881Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4688376Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4688841Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4689474Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3449815040. 2025-12-04T11:48:05.4690073Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4690420Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4691009Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4691485Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4691849Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4692260Z [rank0]:E1204 11:47:40.265000 335486 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:48:05.4692501Z dist init r=0, world=4 2025-12-04T11:48:05.4692916Z [rank0]:[W1204 11:47:40.176653701 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:48:05.4693327Z FAILED [7.9118s] [100%] 2025-12-04T11:48:05.4693393Z 2025-12-04T11:48:05.4693453Z =================================== FAILURES =================================== 2025-12-04T11:48:05.4693673Z _______________ TestUnevenParamShardCUDA.test_one_iteration_cuda _______________ 2025-12-04T11:48:05.4693849Z Traceback (most recent call last): 2025-12-04T11:48:05.4694097Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:48:05.4694341Z self._join_processes(fn) 2025-12-04T11:48:05.4694584Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:48:05.4694849Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:48:05.4695118Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:48:05.4695376Z raise RuntimeError(error) 2025-12-04T11:48:05.4695528Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:48:05.4695691Z Traceback (most recent call last): 2025-12-04T11:48:05.4695930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4696169Z getattr(self, test_name)() 2025-12-04T11:48:05.4696399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4696629Z fn() 2025-12-04T11:48:05.4696831Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4697060Z method(*args, **kwargs) 2025-12-04T11:48:05.4697281Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4697510Z method(*args, **kwargs) 2025-12-04T11:48:05.4697727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4697951Z with policy(): 2025-12-04T11:48:05.4698213Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4698444Z raise RuntimeError(msg) 2025-12-04T11:48:05.4698833Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4699192Z 2025-12-04T11:48:05.4699266Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4699615Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4699857Z 2025-12-04T11:48:05.4699945Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4700072Z 2025-12-04T11:48:05.4700074Z 2025-12-04T11:48:05.4700155Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:48:05.4700359Z Process 1 terminated with exit code 10, terminating remaining processes. 
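Note: the RuntimeError above is produced by PyTorch's CUDA/HIP memory leak check, which compares per-device memory counters taken before and after the test body and fails the test when both the caching allocator and the driver report more memory held afterwards. The Python sketch below only illustrates that before/after comparison; it is not the implementation in torch/testing/_internal/common_utils.py, and the run_with_leak_check helper and the _kept global are made up for the example.

import torch

def run_with_leak_check(test_fn, device=0):
    # Snapshot both views of device memory before the test body runs.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator bytes in use
    free_before, total = torch.cuda.mem_get_info(device)    # driver-level free/total bytes

    test_fn()

    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)

    # Flag a leak only when both the caching allocator and the driver agree
    # that more memory is held after the test, mirroring the message above.
    if alloc_after > alloc_before and free_after < free_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator went "
            f"{alloc_before} -> {alloc_after} bytes, driver-allocated went "
            f"{total - free_before} -> {total - free_after} bytes"
        )

def _leaky():
    # Keeping a module-level reference prevents the tensor from being freed,
    # so the post-test counters stay above the pre-test ones.
    global _kept
    _kept = torch.ones(1024, device="cuda:0")

if __name__ == "__main__":
    run_with_leak_check(_leaky)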
2025-12-04T11:48:05.4700726Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-0cec19ea9b3dfbff.xml - 2025-12-04T11:48:05.4701064Z =========================== short test summary info ============================ 2025-12-04T11:48:05.4701388Z FAILED [7.9118s] distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:48:05.4701693Z Traceback (most recent call last): 2025-12-04T11:48:05.4701936Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4702179Z getattr(self, test_name)() 2025-12-04T11:48:05.4702409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4702672Z fn() 2025-12-04T11:48:05.4702873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4703108Z method(*args, **kwargs) 2025-12-04T11:48:05.4703328Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4703559Z method(*args, **kwargs) 2025-12-04T11:48:05.4703775Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4704004Z with policy(): 2025-12-04T11:48:05.4704217Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4704447Z raise RuntimeError(msg) 2025-12-04T11:48:05.4704842Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4705200Z 2025-12-04T11:48:05.4705275Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4705587Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4705825Z 2025-12-04T11:48:05.4705916Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4706104Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:48:05.4706262Z ============================== 1 failed in 7.92s =============================== 2025-12-04T11:48:05.4706393Z Got exit code 1 2025-12-04T11:48:05.4706490Z Retrying single test... 
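Note: the repro line printed above (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda) can be run directly from the base repo dir. The sketch below is a hypothetical wrapper, not part of the PyTorch repo, that invokes the same command via subprocess; it assumes the current working directory is a pytorch checkout and that ROCm GPUs are visible.

import os
import subprocess

# Same environment variables and command as the repro line in the log above.
env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
)
subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_uneven.py",
        "TestUnevenParamShardCUDA.test_one_iteration_cuda",
    ],
    env=env,
    check=True,  # raises CalledProcessError if the leak check trips again
)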
2025-12-04T11:48:05.4706755Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-39cddf3a330f88b6.xml 2025-12-04T11:48:05.4707043Z ============================= test session starts ============================== 2025-12-04T11:48:05.4707253Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:48:05.4707440Z cachedir: .pytest_cache 2025-12-04T11:48:05.4707663Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:48:05.4707902Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:48:05.4708020Z configfile: pytest.ini 2025-12-04T11:48:05.4708336Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:48:05.4708578Z collecting ... collected 1 item 2025-12-04T11:48:05.4708846Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda 2025-12-04T11:48:05.4709122Z Running 1 items in this shard 2025-12-04T11:48:05.4709193Z 2025-12-04T11:48:05.4709479Z distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda I1204 11:47:44.379000 335819 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 335888 2025-12-04T11:48:05.4709948Z I1204 11:47:44.380000 335819 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 335889 2025-12-04T11:48:05.4710293Z I1204 11:47:44.380000 335819 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 335890 2025-12-04T11:48:05.4710633Z I1204 11:47:44.381000 335819 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 335891 2025-12-04T11:48:05.4710962Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4711333Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4711824Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4712309Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4712786Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4713231Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4713673Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4714134Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4714594Z [rank2]:E1204 11:47:50.418000 335890 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4715060Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4715524Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4715976Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4716430Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4716891Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4717568Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3290431488. 2025-12-04T11:48:05.4718211Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4718563Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4719129Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4719608Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4719974Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4720391Z [rank2]:E1204 11:47:50.418000 335890 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:48:05.4720669Z dist init r=2, world=4 2025-12-04T11:48:05.4720874Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4721210Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4721704Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4722192Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4722674Z [rank0]:E1204 11:47:50.428000 335888 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4732679Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4733142Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4733627Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4734105Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4734576Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4735053Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4735514Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4736052Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4736522Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4737165Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3449815040. 
2025-12-04T11:48:05.4737769Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4738119Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4738733Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4739213Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4739615Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4740031Z [rank0]:E1204 11:47:50.428000 335888 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:48:05.4740278Z dist init r=0, world=4 2025-12-04T11:48:05.4740488Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4740831Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4741320Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4741803Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4742279Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4742726Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4743170Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4743638Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4744101Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4744564Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4745025Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4745507Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4745962Z 
[rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4746431Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4747067Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3240099840. 2025-12-04T11:48:05.4747662Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4748011Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4748619Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4749136Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4749502Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4749915Z [rank3]:E1204 11:47:50.431000 335891 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:48:05.4750156Z dist init r=3, world=4 2025-12-04T11:48:05.4750363Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4750699Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4751185Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4751664Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4752144Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4752591Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4753029Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4753495Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4753960Z [rank1]:E1204 11:47:50.433000 335889 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4754420Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4754913Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4755364Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4755820Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4756286Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4756922Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4757516Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4757863Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4758496Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4758973Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4759337Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4759751Z [rank1]:E1204 11:47:50.433000 335889 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:48:05.4759995Z dist init r=1, world=4 2025-12-04T11:48:05.4760398Z [rank0]:[W1204 11:47:50.275869666 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:48:05.4760811Z FAILED [7.9132s] [100%] 2025-12-04T11:48:05.4760878Z 2025-12-04T11:48:05.4760939Z =================================== FAILURES =================================== 2025-12-04T11:48:05.4761130Z _______________ TestUnevenParamShardCUDA.test_one_iteration_cuda _______________ 2025-12-04T11:48:05.4761305Z Traceback (most recent call last): 2025-12-04T11:48:05.4761555Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:48:05.4761799Z self._join_processes(fn) 2025-12-04T11:48:05.4762045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:48:05.4762313Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:48:05.4762582Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:48:05.4762843Z raise RuntimeError(error) 2025-12-04T11:48:05.4762995Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:48:05.4763157Z Traceback (most recent call last): 2025-12-04T11:48:05.4763398Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4763639Z getattr(self, test_name)() 2025-12-04T11:48:05.4763912Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4764146Z fn() 2025-12-04T11:48:05.4764348Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4764582Z method(*args, **kwargs) 2025-12-04T11:48:05.4764805Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4765036Z method(*args, **kwargs) 2025-12-04T11:48:05.4765253Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4765481Z with policy(): 2025-12-04T11:48:05.4765693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4765923Z raise RuntimeError(msg) 2025-12-04T11:48:05.4766317Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3449815040. 
2025-12-04T11:48:05.4766678Z 2025-12-04T11:48:05.4766796Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4767111Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4767351Z 2025-12-04T11:48:05.4767440Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4767566Z 2025-12-04T11:48:05.4767627Z Process 2 exited with error code 10 and exception: 2025-12-04T11:48:05.4767767Z Traceback (most recent call last): 2025-12-04T11:48:05.4768010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4768289Z getattr(self, test_name)() 2025-12-04T11:48:05.4768521Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4768752Z fn() 2025-12-04T11:48:05.4768956Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4769187Z method(*args, **kwargs) 2025-12-04T11:48:05.4769405Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4769633Z method(*args, **kwargs) 2025-12-04T11:48:05.4769850Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4770074Z with policy(): 2025-12-04T11:48:05.4770287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4770518Z raise RuntimeError(msg) 2025-12-04T11:48:05.4770908Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3290431488. 2025-12-04T11:48:05.4771265Z 2025-12-04T11:48:05.4771341Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4771653Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4771889Z 2025-12-04T11:48:05.4771979Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4772102Z 2025-12-04T11:48:05.4772104Z 2025-12-04T11:48:05.4772185Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:48:05.4772423Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:48:05.4772790Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-39cddf3a330f88b6.xml - 2025-12-04T11:48:05.4773130Z =========================== short test summary info ============================ 2025-12-04T11:48:05.4773456Z FAILED [7.9132s] distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:48:05.4773760Z Traceback (most recent call last): 2025-12-04T11:48:05.4774005Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4774250Z getattr(self, test_name)() 2025-12-04T11:48:05.4774482Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4774718Z fn() 2025-12-04T11:48:05.4774920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4775153Z method(*args, **kwargs) 2025-12-04T11:48:05.4775371Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4775632Z method(*args, **kwargs) 2025-12-04T11:48:05.4775849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4776074Z with policy(): 2025-12-04T11:48:05.4776284Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4776513Z raise RuntimeError(msg) 2025-12-04T11:48:05.4776908Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3449815040. 
2025-12-04T11:48:05.4777264Z 2025-12-04T11:48:05.4777338Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4777647Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4777887Z 2025-12-04T11:48:05.4777973Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4778098Z 2025-12-04T11:48:05.4778192Z Process 2 exited with error code 10 and exception: 2025-12-04T11:48:05.4778333Z Traceback (most recent call last): 2025-12-04T11:48:05.4778576Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4778817Z getattr(self, test_name)() 2025-12-04T11:48:05.4779049Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4779280Z fn() 2025-12-04T11:48:05.4779480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4779711Z method(*args, **kwargs) 2025-12-04T11:48:05.4779930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4780158Z method(*args, **kwargs) 2025-12-04T11:48:05.4780375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4780600Z with policy(): 2025-12-04T11:48:05.4780810Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4781041Z raise RuntimeError(msg) 2025-12-04T11:48:05.4781463Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3290431488. 2025-12-04T11:48:05.4781817Z 2025-12-04T11:48:05.4781893Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4782200Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4782437Z 2025-12-04T11:48:05.4782523Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4782711Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:48:05.4782871Z ============================== 1 failed in 7.92s =============================== 2025-12-04T11:48:05.4783003Z Got exit code 1 2025-12-04T11:48:05.4783101Z Retrying single test... 
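The "Caching allocator allocated memory was 512 and is now reported as 1024" errors above come from PyTorch's memory-leak checker, enabled here through PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1. The following is only a minimal sketch of the idea, not the actual harness in common_utils.py: snapshot per-device allocator usage before the test body and fail if it has grown afterwards (the real check also consults driver-level allocations, as the messages above show).

    import torch

    def run_with_leak_check(test_fn, device: int = 0) -> None:
        # Illustrative only; the real checker also compares CUDA/HIP driver-level
        # allocations before declaring a leak.
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)
        test_fn()
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: {before} -> {after} bytes"
            )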
2025-12-04T11:48:05.4783366Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-6a36a936c5172dbc.xml 2025-12-04T11:48:05.4783657Z ============================= test session starts ============================== 2025-12-04T11:48:05.4783908Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:48:05.4784101Z cachedir: .pytest_cache 2025-12-04T11:48:05.4784328Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:48:05.4784570Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:48:05.4784693Z configfile: pytest.ini 2025-12-04T11:48:05.4784924Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:48:05.4785170Z collecting ... collected 1 item 2025-12-04T11:48:05.4785444Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda 2025-12-04T11:48:05.4785720Z Running 1 items in this shard 2025-12-04T11:48:05.4785792Z 2025-12-04T11:48:05.4786081Z distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda I1204 11:47:54.603000 336221 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 336290 2025-12-04T11:48:05.4786556Z I1204 11:47:54.604000 336221 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 336291 2025-12-04T11:48:05.4786900Z I1204 11:47:54.604000 336221 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 336292 2025-12-04T11:48:05.4787240Z I1204 11:47:54.605000 336221 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 336293 2025-12-04T11:48:05.4787573Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4787913Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4788459Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4788947Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4789428Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4789881Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4790445Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4790918Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4791386Z [rank2]:E1204 11:48:00.764000 336292 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4791851Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4792318Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4792772Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4793228Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4793727Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4794372Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3290431488. 2025-12-04T11:48:05.4794973Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4795326Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4795893Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4796375Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4796746Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4797168Z [rank2]:E1204 11:48:00.764000 336292 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:48:05.4797414Z dist init r=2, world=4 2025-12-04T11:48:05.4797621Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4797963Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4798484Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4798965Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4799489Z [rank1]:E1204 11:48:00.768000 336291 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4799944Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4800389Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4800852Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4801316Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4801780Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4802244Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4802729Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4803182Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4803651Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4804294Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 
2025-12-04T11:48:05.4804893Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4805243Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4805805Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4806284Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4806652Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4807066Z [rank1]:E1204 11:48:00.768000 336291 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:48:05.4807310Z dist init r=1, world=4 2025-12-04T11:48:05.4807514Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4807851Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4808385Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4808896Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4809380Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4809832Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4810272Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4810739Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4811208Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4811673Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4812169Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4812623Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4813080Z 
[rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4813546Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4814187Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2243952640 and is now 3240099840. 2025-12-04T11:48:05.4814784Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4815135Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4815701Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4816185Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4816555Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4816972Z [rank3]:E1204 11:48:00.776000 336293 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:48:05.4817214Z dist init r=3, world=4 2025-12-04T11:48:05.4817417Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:48:05.4817757Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:48:05.4818310Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4818795Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:48:05.4819277Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4819731Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:48:05.4820174Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4820643Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4821109Z [rank0]:E1204 11:48:00.797000 336290 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4821601Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:48:05.4822064Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4822520Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:48:05.4822976Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4823442Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:48:05.4824086Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3449815040. 2025-12-04T11:48:05.4824682Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4825034Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4825595Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4826074Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:48:05.4826439Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4826855Z [rank0]:E1204 11:48:00.797000 336290 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:48:05.4827098Z dist init r=0, world=4 2025-12-04T11:48:05.4827523Z [rank0]:[W1204 11:48:01.665824755 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:48:05.4827940Z FAILED [8.0123s] [100%] 2025-12-04T11:48:05.4828010Z 2025-12-04T11:48:05.4828067Z =================================== FAILURES =================================== 2025-12-04T11:48:05.4828296Z _______________ TestUnevenParamShardCUDA.test_one_iteration_cuda _______________ 2025-12-04T11:48:05.4828475Z Traceback (most recent call last): 2025-12-04T11:48:05.4828725Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:48:05.4828972Z self._join_processes(fn) 2025-12-04T11:48:05.4829223Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:48:05.4829492Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:48:05.4829762Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:48:05.4830024Z raise RuntimeError(error) 2025-12-04T11:48:05.4830211Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:48:05.4830375Z Traceback (most recent call last): 2025-12-04T11:48:05.4830619Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4830864Z getattr(self, test_name)() 2025-12-04T11:48:05.4831097Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4831333Z fn() 2025-12-04T11:48:05.4831536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4831773Z method(*args, **kwargs) 2025-12-04T11:48:05.4831996Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4832228Z method(*args, **kwargs) 2025-12-04T11:48:05.4832450Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4832678Z with policy(): 2025-12-04T11:48:05.4832892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4833125Z raise RuntimeError(msg) 2025-12-04T11:48:05.4833517Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4833872Z 2025-12-04T11:48:05.4833949Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4834264Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4834505Z 2025-12-04T11:48:05.4834597Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4834725Z 2025-12-04T11:48:05.4834727Z 2025-12-04T11:48:05.4834805Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:48:05.4835010Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:48:05.4835379Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-6a36a936c5172dbc.xml - 2025-12-04T11:48:05.4835717Z =========================== short test summary info ============================ 2025-12-04T11:48:05.4836077Z FAILED [8.0123s] distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:48:05.4836383Z Traceback (most recent call last): 2025-12-04T11:48:05.4836631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:48:05.4836878Z getattr(self, test_name)() 2025-12-04T11:48:05.4837113Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:48:05.4837348Z fn() 2025-12-04T11:48:05.4837551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4837782Z method(*args, **kwargs) 2025-12-04T11:48:05.4838004Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:48:05.4838278Z method(*args, **kwargs) 2025-12-04T11:48:05.4838500Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:48:05.4838732Z with policy(): 2025-12-04T11:48:05.4838945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:48:05.4839220Z raise RuntimeError(msg) 2025-12-04T11:48:05.4839612Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestUnevenParamShardCUDA.test_one_iteration_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3307208704. 2025-12-04T11:48:05.4839965Z 2025-12-04T11:48:05.4840042Z To execute this test, run the following from the base repo dir: 2025-12-04T11:48:05.4840358Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_uneven.py TestUnevenParamShardCUDA.test_one_iteration_cuda 2025-12-04T11:48:05.4840595Z 2025-12-04T11:48:05.4840687Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:48:05.4840877Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
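Each run above also ends with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. A minimal sketch of the explicit teardown that warning asks for follows; the nccl backend and env:// rendezvous are illustrative assumptions, not values taken from this job.

    import torch.distributed as dist

    def main() -> None:
        # Assumes rank/world size come from the usual env:// variables
        # (RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT).
        dist.init_process_group(backend="nccl")
        try:
            ...  # training or test body
        finally:
            # Explicit teardown avoids the resource-leak warning at exit.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()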
2025-12-04T11:48:05.4841039Z ============================== 1 failed in 8.02s =============================== 2025-12-04T11:48:05.4841174Z Got exit code 1 2025-12-04T11:48:05.4841386Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda 2025-12-04T11:48:05.4841700Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:48:05.4842064Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-b3a2e625edae2d2f.xml 2025-12-04T11:48:05.4842358Z ============================= test session starts ============================== 2025-12-04T11:48:05.4842570Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:48:05.4842759Z cachedir: .pytest_cache 2025-12-04T11:48:05.4842980Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:48:05.4843221Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:48:05.4843337Z configfile: pytest.ini 2025-12-04T11:48:05.4843561Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:48:05.4843829Z collecting ... collected 1 item / 1 deselected / 0 selected 2025-12-04T11:48:05.4843988Z stepcurrent: skipping 1 already run items. 2025-12-04T11:48:05.4844120Z Running 0 items in this shard 2025-12-04T11:48:05.4844192Z 2025-12-04T11:48:05.4844433Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_uneven/distributed.fsdp.test_fsdp_uneven-b3a2e625edae2d2f.xml - 2025-12-04T11:48:05.4844806Z ============================ 1 deselected in 0.00s ============================= 2025-12-04T11:48:05.4845083Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_uneven.py::TestUnevenParamShardCUDA::test_one_iteration_cuda'] 2025-12-04T11:48:05.4845298Z 2025-12-04T11:48:05.4845488Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_uneven 1/1 (test/test-reports/distributed.fsdp.test_fsdp_uneven_1.1_73d54334789787ed_.log) 2025-12-04T11:48:05.4845712Z 2025-12-04T11:48:05.4845838Z Finished distributed/fsdp/test_fsdp_uneven 1/1 ... [2025-12-04 11:48:05.464076][2288384.113256447], took 0.55min 2025-12-04T11:48:05.4846260Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:48:05.4846652Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:05.4846873Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:48:05.4847053Z Uploading artifacts took 0.00 seconds 2025-12-04T11:48:05.4847189Z distributed/fsdp/test_fsdp_uneven 1/1 failed! 2025-12-04T11:48:05.4847391Z Running distributed/tensor/test_op_strategy 1/1 ... [2025-12-04 11:48:05.466924][2288384.11610755] 2025-12-04T11:48:05.4847617Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:05.4848018Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_op_strategy.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:48:05.467106] 2025-12-04T11:48:30.4684854Z 2025-12-04T11:48:30.4686004Z distributed/tensor/test_op_strategy 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_op_strategy_1.1_f65ccb2b3fdeb576_.log 2025-12-04T11:48:30.4694305Z Running 24 items in this shard: test/distributed/tensor/test_op_strategy.py::TestEinsumDims::test_batch_dims, test/distributed/tensor/test_op_strategy.py::TestEinsumDims::test_bmm_dims, test/distributed/tensor/test_op_strategy.py::TestEinsumDims::test_free_dims, test/distributed/tensor/test_op_strategy.py::TestEinsumDims::test_mm_dims, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_bmm_1d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_bmm_2d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_bmm_diffinndim_2d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_bmm_diffoutndim_2d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_linearity_1d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_mm_1d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_mm_2d_mesh, test/distributed/tensor/test_op_strategy.py::TestEinsumStrategies::test_pointwise_1d_mesh, test/distributed/tensor/test_op_strategy.py::TestCostModel::test_bmm_strategies, test/distributed/tensor/test_op_strategy.py::TestCostModel::test_mm_strategies, test/distributed/tensor/test_op_strategy.py::TestCostModel::test_redistribute_cost_latency, test/distributed/tensor/test_op_strategy.py::TestCostModel::test_redistribute_cost_mesh_1d, test/distributed/tensor/test_op_strategy.py::TestCostModel::test_redistribute_cost_mesh_2d, test/distributed/tensor/test_op_strategy.py::DistTensorReplicateStrategyRegistrationTest::test_replicate_strategy_placement, test/distributed/tensor/test_op_strategy.py::DistTensorReplicateStrategyRegistrationTest::test_tuple_replicate_strategy_placement, test/distributed/tensor/test_op_strategy.py::TestStrategyHashing::test_call_with_different_nontensor_args, test/distributed/tensor/test_op_strategy.py::TestStrategyOperation::test_cache_clean, test/distributed/tensor/test_op_strategy.py::DistTensorReplicateStrategyRegistrationTestWithLocalTensor::test_replicate_strategy_placement, test/distributed/tensor/test_op_strategy.py::DistTensorReplicateStrategyRegistrationTestWithLocalTensor::test_tuple_replicate_strategy_placement, test/distributed/tensor/test_op_strategy.py::TestStrategyHashingWithLocalTensor::test_call_with_different_nontensor_args 2025-12-04T11:48:30.4697707Z 2025-12-04T11:48:30.4697843Z Finished distributed/tensor/test_op_strategy 1/1 ... [2025-12-04 11:48:30.468065][2288409.117244593], took 0.42min 2025-12-04T11:48:30.4698308Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:48:30.4707633Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:48:30.4711209Z Running distributed/fsdp/test_fsdp_grad_acc 1/1 ... 
[2025-12-04 11:48:30.471001][2288409.120183193] 2025-12-04T11:48:30.4711419Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:48:30.4712967Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_grad_acc.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:48:30.471210] 2025-12-04T11:49:23.1167237Z 2025-12-04T11:49:23.1169396Z distributed/fsdp/test_fsdp_grad_acc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_grad_acc_1.1_6157c2e534b414ab_.log 2025-12-04T11:49:23.1175396Z Running 6 items in this shard: test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_configs0_use_orig_params_False, test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_configs0_use_orig_params_True, test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_configs1_use_orig_params_False, test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_configs1_use_orig_params_True, test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_cpu_offload_use_orig_params_False, test/distributed/fsdp/test_fsdp_grad_acc.py::TestGradAcc::test_grad_acc_cpu_offload_use_orig_params_True 2025-12-04T11:49:23.1178882Z 2025-12-04T11:49:23.1179578Z Finished distributed/fsdp/test_fsdp_grad_acc 1/1 ... [2025-12-04 11:49:23.116467][2288461.765646529], took 0.88min 2025-12-04T11:49:23.1180691Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:49:23.1198550Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:49:23.1201573Z Running distributed/checkpoint/test_state_dict_stager 1/1 ... [2025-12-04 11:49:23.120002][2288461.769185945] 2025-12-04T11:49:23.1201883Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:49:23.1203028Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_state_dict_stager.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:49:23.120185] 2025-12-04T11:49:48.6741757Z 2025-12-04T11:49:48.6747049Z distributed/checkpoint/test_state_dict_stager 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_state_dict_stager_1.1_18563662566f98e7_.log 2025-12-04T11:49:48.6753449Z Running 14 items in this shard: test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_caching, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_complex_storage_sharing, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_cpu_storage_independence, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_dataclasses, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_different_dtypes, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_empty_tensors, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_tensor_attrs, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_tensor_pinned_and_shared, test/distributed/checkpoint/test_state_dict_stager.py::TestStateDictStager::test_views, test/distributed/checkpoint/test_state_dict_stager.py::TestDTensorStateDictStager::test_dtensor, test/distributed/checkpoint/test_state_dict_stager.py::TestReplicationStager::test_replication_basic, test/distributed/checkpoint/test_state_dict_stager.py::TestReplicationStager::test_replication_dtensors, test/distributed/checkpoint/test_state_dict_stager.py::TestReplicationStager::test_replication_persistence, test/distributed/checkpoint/test_state_dict_stager.py::TestReplicationStager::test_replication_sharded_tensors 2025-12-04T11:49:48.6757639Z 2025-12-04T11:49:48.6757940Z Finished distributed/checkpoint/test_state_dict_stager 1/1 ... [2025-12-04 11:49:48.673912][2288487.323090275], took 0.43min 2025-12-04T11:49:48.6758876Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:49:48.6773100Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:49:48.6775872Z Running distributed/fsdp/test_fsdp_freezing_weights 1/1 ... [2025-12-04 11:49:48.677483][2288487.326667078] 2025-12-04T11:49:48.6776155Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:49:48.6777807Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_freezing_weights.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:49:48.677660] 2025-12-04T11:53:50.3187145Z 2025-12-04T11:53:50.3188408Z distributed/fsdp/test_fsdp_freezing_weights 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_freezing_weights_1.1_ca4a55d16ff319d1_.log 2025-12-04T11:53:50.3205731Z Running 32 items in this shard: test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_False, 
test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_False_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_GradToNone_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_False_disable_autograd_True_forward_prefetch_True, 
test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_False_forward_prefetch_True, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_False, test/distributed/fsdp/test_fsdp_freezing_weights.py::TestFreezingWeights::test_freezing_weights_with_nested_trunk_True_freezing_method_FreezingMethod_RequiresGrad_freeze_after_wrap_fsdp_True_disable_autograd_True_forward_prefetch_True 2025-12-04T11:53:50.3217372Z 2025-12-04T11:53:50.3217522Z Finished distributed/fsdp/test_fsdp_freezing_weights 1/1 ... [2025-12-04 11:53:50.318573][2288728.967751208], took 4.03min 2025-12-04T11:53:50.3217984Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:53:50.3218463Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:53:50.3221583Z Running distributed/_composable/fsdp/test_fully_shard_init 1/1 ... [2025-12-04 11:53:50.322032][2288728.971215445] 2025-12-04T11:53:50.3221818Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:53:50.3223300Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_composable/fsdp/test_fully_shard_init.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:53:50.322216] 2025-12-04T11:54:03.0561568Z 2025-12-04T11:54:03.0562452Z distributed/_composable/fsdp/test_fully_shard_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._composable.fsdp.test_fully_shard_init_1.1_b06e8c3d530e8d5f_.log 2025-12-04T11:54:03.0574456Z Running 42 items in this shard: test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceTensor::test_move_states_to_device_ignored_param_device, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceTensor::test_move_states_to_device_tensor, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceDTensor::test_move_states_to_device_dtensor_invalid, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardDeviceDTensor::test_move_states_to_device_dtensor_valid, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMeshArg::test_2d_mesh_without_mesh_dim_names, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMeshArg::test_invalid_mesh_ndim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_duplicate, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_nested, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_nested_fully_shard_and_replicate, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_modules_single, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_nested_fully_shard, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardManagedModulesAndStates::test_managed_states_shared_params_and_buffers, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_duplicates, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_list_of_mlps, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardParamModuleInfos::test_get_param_module_infos_shared_params, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_raise_noncontiguous_parameter, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_raise_scalar_parameter, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterTensor::test_shard_tensor_parameters, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardedParameterDTensor::test_shard_dtensor_parameters, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_double_lazy_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_is_root, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_module_and_param_fqns, 
test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_fully_shard_multi_module_root, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardLazyInit::test_reset_sharded_param_in_lazy_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_invalid_meta_device_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_meta_device_1d_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_meta_device_2d_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMetaDeviceInit::test_rank0_broadcast_meta_device_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardProcessGroupInit::test_1d_process_group_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardProcessGroupInit::test_2d_process_group_init, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardHSDPBroadcast::test_hsdp_broadcast_across_replicas, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestHSDPWithCustomHook::test_custom_hook_custom_stream, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestHSDPWithCustomHook::test_custom_hsdp_all_reduce_hook, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_transformer_shard_dim_neg1, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_transformer_shard_largest_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_1d_uneven_shard_largest_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_init_2d_transformer_shard_diff_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardShardPlacementFn::test_invalid_shard_dim, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardOldImport::test_old_import_training, test/distributed/_composable/fsdp/test_fully_shard_init.py::TestFullyShardMixedDtypeParam::test_mixed_dtypes_no_grad_param 2025-12-04T11:54:03.0583346Z 2025-12-04T11:54:03.0583532Z Finished distributed/_composable/fsdp/test_fully_shard_init 1/1 ... [2025-12-04 11:54:03.055909][2288741.705087673], took 0.21min 2025-12-04T11:54:03.0584064Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:54:03.0592278Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:54:03.0595675Z Running distributed/fsdp/test_fsdp_exec_order 1/1 ... [2025-12-04 11:54:03.059427][2288741.708610368] 2025-12-04T11:54:03.0595885Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:54:03.0597357Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_exec_order.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:54:03.059613] 2025-12-04T11:58:24.4280998Z 2025-12-04T11:58:24.4281610Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_exec_order 1/1 (test/test-reports/distributed.fsdp.test_fsdp_exec_order_1.1_e994e873868c2dab_.log) 2025-12-04T11:58:24.4282610Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-208acf942d1af133.xml 2025-12-04T11:58:24.4282927Z ============================= test session starts ============================== 2025-12-04T11:58:24.4283157Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4283354Z cachedir: .pytest_cache 2025-12-04T11:58:24.4283583Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4283828Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4283961Z configfile: pytest.ini 2025-12-04T11:58:24.4284194Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4284449Z collecting ... collected 8 items 2025-12-04T11:58:24.4284612Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T11:58:24.4286237Z Running 8 items in this shard: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.4287841Z 2025-12-04T11:58:24.4288458Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda I1204 11:54:04.744000 351856 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 351925 2025-12-04T11:58:24.4288987Z I1204 11:54:04.744000 351856 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 351926 2025-12-04T11:58:24.4289500Z I1204 11:54:04.745000 351856 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 351927 2025-12-04T11:58:24.4289838Z I1204 11:54:04.746000 351856 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 351928 2025-12-04T11:58:24.4290529Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4291157Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4291748Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4292331Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4292912Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4293544Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4294123Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4294707Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4294970Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4295372Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4295871Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4296352Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4296835Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4297290Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4297736Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4298262Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4298731Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T11:58:24.4299235Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4299699Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4300161Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4300640Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4301121Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4301805Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4314816Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4315203Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4315849Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4316450Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4316845Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4317271Z [rank0]:E1204 11:54:09.948000 351925 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4317522Z dist init r=0, world=4 2025-12-04T11:58:24.4317740Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4318086Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4318647Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4319135Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4319620Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4320074Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4320519Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4321070Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4321537Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4322006Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4322474Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4322928Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4323387Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4323860Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4324580Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 
2025-12-04T11:58:24.4325217Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4325573Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4326187Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4326720Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4327089Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4327507Z [rank2]:E1204 11:54:09.950000 351927 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4327753Z dist init r=2, world=4 2025-12-04T11:58:24.4327961Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4328335Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4328823Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4329310Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4329791Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4330240Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4330732Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4331199Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4331666Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4332129Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4332597Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4333050Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4333507Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4334017Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4334691Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4335330Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4335679Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4336286Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4336807Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4337174Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4337593Z [rank1]:E1204 11:54:09.953000 351926 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4337838Z dist init r=1, world=4 2025-12-04T11:58:24.4338043Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4338435Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4338924Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4339404Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4339921Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4340377Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4340820Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4341286Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T11:58:24.4341751Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4342217Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4342683Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4343183Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4343642Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4344110Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4344786Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:58:24.4345418Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4345769Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4346380Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4346903Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4347271Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4347687Z [rank3]:E1204 11:54:10.006000 351928 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4347933Z dist init r=3, world=4 2025-12-04T11:58:24.4348039Z FAILED [6.1123s] [ 12%] 2025-12-04T11:58:24.4348105Z 2025-12-04T11:58:24.4348203Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4348411Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda __ 2025-12-04T11:58:24.4348602Z Traceback (most recent call last): 2025-12-04T11:58:24.4348853Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4349145Z self._join_processes(fn) 2025-12-04T11:58:24.4349395Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4349664Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4349937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4350203Z raise RuntimeError(error) 2025-12-04T11:58:24.4350363Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4350531Z Traceback (most recent call last): 2025-12-04T11:58:24.4350776Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4351022Z getattr(self, test_name)() 2025-12-04T11:58:24.4351259Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4351498Z fn() 2025-12-04T11:58:24.4351704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4351994Z method(*args, **kwargs) 2025-12-04T11:58:24.4352218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4352451Z method(*args, **kwargs) 2025-12-04T11:58:24.4352672Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4352902Z with policy(): 2025-12-04T11:58:24.4353118Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4353351Z raise RuntimeError(msg) 2025-12-04T11:58:24.4353783Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4354176Z 2025-12-04T11:58:24.4354259Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4354622Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4354902Z 2025-12-04T11:58:24.4354997Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4355126Z 2025-12-04T11:58:24.4355128Z 2025-12-04T11:58:24.4355210Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4355417Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4355802Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-208acf942d1af133.xml - 2025-12-04T11:58:24.4356153Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4356523Z FAILED [6.1123s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4356865Z Traceback (most recent call last): 2025-12-04T11:58:24.4357115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4357362Z getattr(self, test_name)() 2025-12-04T11:58:24.4357598Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4357835Z fn() 2025-12-04T11:58:24.4358071Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4358347Z method(*args, **kwargs) 2025-12-04T11:58:24.4358568Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4358802Z method(*args, **kwargs) 2025-12-04T11:58:24.4359023Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4359250Z with policy(): 2025-12-04T11:58:24.4359465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4359699Z raise RuntimeError(msg) 2025-12-04T11:58:24.4360132Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4360528Z 2025-12-04T11:58:24.4360604Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4360964Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4361296Z 2025-12-04T11:58:24.4361390Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4361581Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4361742Z ============================== 1 failed in 6.12s =============================== 2025-12-04T11:58:24.4361873Z Got exit code 1 2025-12-04T11:58:24.4361974Z Retrying single test... 
2025-12-04T11:58:24.4362254Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-22d3b0b8730091a0.xml 2025-12-04T11:58:24.4362558Z ============================= test session starts ============================== 2025-12-04T11:58:24.4362774Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4362967Z cachedir: .pytest_cache 2025-12-04T11:58:24.4363194Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4363435Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4363559Z configfile: pytest.ini 2025-12-04T11:58:24.4363792Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4364068Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4364421Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4364743Z Running 1 items in this shard 2025-12-04T11:58:24.4364817Z 2025-12-04T11:58:24.4365144Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda I1204 11:54:13.436000 352234 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 352303 2025-12-04T11:58:24.4365663Z I1204 11:54:13.437000 352234 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 352304 2025-12-04T11:58:24.4366010Z I1204 11:54:13.438000 352234 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 352305 2025-12-04T11:58:24.4366353Z I1204 11:54:13.439000 352234 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 352306 2025-12-04T11:58:24.4367093Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4367688Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4368326Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4368914Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4369506Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4370125Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4370709Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4371291Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4371531Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4371876Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4372366Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4372849Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4373331Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4373780Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4374226Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4374693Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4375158Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4375619Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4376117Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4376570Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4377025Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4377489Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4378218Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2243952640 and is now 3005218816. 2025-12-04T11:58:24.4378855Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4379204Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4379847Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4380371Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4380735Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4381152Z [rank3]:E1204 11:54:18.754000 352306 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4381396Z dist init r=3, world=4 2025-12-04T11:58:24.4381601Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4381940Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4382425Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4382905Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4383385Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4383832Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4384272Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4384736Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4385197Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4385697Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4386159Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4386610Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4387062Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4387530Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4388250Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4388915Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4389262Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4389864Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4390386Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4390748Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4391163Z [rank1]:E1204 11:54:18.758000 352304 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4391403Z dist init r=1, world=4 2025-12-04T11:58:24.4391604Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4391940Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4392427Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4392909Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4393388Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4393838Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4394276Z [rank2]:E1204 11:54:18.763000 352305 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4394773Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4395235Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4395700Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4396160Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4396609Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4397062Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4397527Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4398248Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 
2025-12-04T11:58:24.4398910Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4399258Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4399866Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4400387Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4400750Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4401163Z [rank2]:E1204 11:54:18.763000 352305 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4401404Z dist init r=2, world=4 2025-12-04T11:58:24.4401606Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4401943Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4402428Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4402908Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4403385Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4403831Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4404302Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4404764Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4405227Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4405686Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4406147Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4406597Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4407055Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4407541Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4408260Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:58:24.4408890Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4409238Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4409845Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4410363Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4410725Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4411138Z [rank0]:E1204 11:54:18.770000 352303 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4411378Z dist init r=0, world=4 2025-12-04T11:58:24.4411481Z FAILED [6.3109s] [100%] 2025-12-04T11:58:24.4411548Z 2025-12-04T11:58:24.4411606Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4411807Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda __ 2025-12-04T11:58:24.4411991Z Traceback (most recent call last): 2025-12-04T11:58:24.4412234Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4412474Z self._join_processes(fn) 2025-12-04T11:58:24.4412718Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4412979Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4413278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4413540Z raise RuntimeError(error) 2025-12-04T11:58:24.4413691Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4413852Z Traceback (most recent call last): 2025-12-04T11:58:24.4414090Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4414331Z getattr(self, test_name)() 2025-12-04T11:58:24.4414559Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4414788Z fn() 2025-12-04T11:58:24.4414989Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4415221Z method(*args, **kwargs) 2025-12-04T11:58:24.4415441Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4415669Z method(*args, **kwargs) 2025-12-04T11:58:24.4415884Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4416144Z with policy(): 2025-12-04T11:58:24.4416352Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4416579Z raise RuntimeError(msg) 2025-12-04T11:58:24.4417020Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2243952640 and is now 3005218816. 2025-12-04T11:58:24.4417409Z 2025-12-04T11:58:24.4417484Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4417841Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4418121Z 2025-12-04T11:58:24.4418232Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4418357Z 2025-12-04T11:58:24.4418359Z 2025-12-04T11:58:24.4418437Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4418636Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4419008Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-22d3b0b8730091a0.xml - 2025-12-04T11:58:24.4419350Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4419708Z FAILED [6.3109s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4420045Z Traceback (most recent call last): 2025-12-04T11:58:24.4420287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4420528Z getattr(self, test_name)() 2025-12-04T11:58:24.4420759Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4420988Z fn() 2025-12-04T11:58:24.4421186Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4421413Z method(*args, **kwargs) 2025-12-04T11:58:24.4421627Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4421899Z method(*args, **kwargs) 2025-12-04T11:58:24.4422115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4422336Z with policy(): 2025-12-04T11:58:24.4422546Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4422777Z raise RuntimeError(msg) 2025-12-04T11:58:24.4423203Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2243952640 and is now 3005218816. 2025-12-04T11:58:24.4423595Z 2025-12-04T11:58:24.4423672Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4424029Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4424310Z 2025-12-04T11:58:24.4424397Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4424585Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4424787Z ======================= 1 failed, 7 deselected in 6.32s ======================== 2025-12-04T11:58:24.4424923Z Got exit code 1 2025-12-04T11:58:24.4425018Z Retrying single test... 
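The leak report above ("Caching allocator allocated memory was 512 and is now reported as 2560 ...") comes from the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 policy, which snapshots per-device memory before the test and re-checks it on exit; that is the "with policy():" frame visible in every traceback. Below is a rough, hypothetical sketch of that before/after comparison using torch.cuda.memory_allocated() for the caching-allocator number and torch.cuda.mem_get_info() for the driver-level number. It only illustrates the idea and assumes a CUDA/ROCm device is present; it is not the CudaMemoryLeakCheck code in common_utils.py, whose thresholds and retry logic are more involved.

import torch

class MemLeakCheck:
    """Flags memory still allocated on any device after a test (illustrative only)."""

    def __enter__(self):
        torch.cuda.synchronize()
        n = torch.cuda.device_count()
        self.allocator_before = [torch.cuda.memory_allocated(d) for d in range(n)]
        # mem_get_info returns (free_bytes, total_bytes) for a device.
        self.driver_used_before = [
            total - free
            for free, total in (torch.cuda.mem_get_info(d) for d in range(n))
        ]
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            return False  # do not mask the test's own failure
        torch.cuda.synchronize()
        for d in range(torch.cuda.device_count()):
            after = torch.cuda.memory_allocated(d)
            free, total = torch.cuda.mem_get_info(d)
            if after > self.allocator_before[d] and (total - free) > self.driver_used_before[d]:
                raise RuntimeError(
                    f"possible leak on device {d}: allocator "
                    f"{self.allocator_before[d]} -> {after} bytes"
                )
        return False

Wrapping the test body as "with MemLeakCheck(): run_test()" mirrors how the policy context manager surrounds the test method in the tracebacks above.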
2025-12-04T11:58:24.4425295Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-caab704f2611f4a9.xml 2025-12-04T11:58:24.4425597Z ============================= test session starts ============================== 2025-12-04T11:58:24.4425809Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4425998Z cachedir: .pytest_cache 2025-12-04T11:58:24.4426225Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4426464Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4426583Z configfile: pytest.ini 2025-12-04T11:58:24.4426812Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4427079Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4427421Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4427731Z Running 1 items in this shard 2025-12-04T11:58:24.4427805Z 2025-12-04T11:58:24.4428131Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda I1204 11:54:22.070000 352612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 352681 2025-12-04T11:58:24.4428684Z I1204 11:54:22.070000 352612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 352682 2025-12-04T11:58:24.4429027Z I1204 11:54:22.071000 352612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 352683 2025-12-04T11:58:24.4429365Z I1204 11:54:22.072000 352612 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 352684 2025-12-04T11:58:24.4430052Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4430680Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4431263Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4431844Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4432424Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4433002Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4433585Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4434208Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4434448Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4434793Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4435287Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4435768Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4436249Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4436700Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4437139Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4437602Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4438065Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4438567Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4439029Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4439478Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4439971Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4440434Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4441105Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4441738Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4442087Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4442695Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4443253Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4443617Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4444031Z [rank1]:E1204 11:54:27.309000 352682 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4444273Z dist init r=1, world=4 2025-12-04T11:58:24.4444475Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4444812Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4445296Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4445776Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4446253Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4446698Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4447139Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4447601Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4448070Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4448575Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4449035Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4449524Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4449977Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4450442Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4451110Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:58:24.4451740Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4452087Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4452736Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4453256Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4453619Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4454032Z [rank3]:E1204 11:54:27.330000 352684 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4454273Z dist init r=3, world=4 2025-12-04T11:58:24.4454475Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4454813Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4455299Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4455782Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4456261Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4456707Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4457146Z [rank2]:E1204 11:54:27.332000 352683 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4457607Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4458070Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4458611Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4459071Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4459522Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4459974Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4460438Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4461204Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 
2025-12-04T11:58:24.4461873Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4462220Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4462828Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4463349Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4463713Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4464126Z [rank2]:E1204 11:54:27.332000 352683 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4464366Z dist init r=2, world=4 2025-12-04T11:58:24.4464566Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4464901Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4465386Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4465865Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4466345Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4466793Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4467231Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4467691Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4468422Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4468883Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4469345Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4469798Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4470252Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4470714Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4471380Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:58:24.4472040Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4472394Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4473003Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4473525Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4473888Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4474301Z [rank0]:E1204 11:54:27.335000 352681 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4474541Z dist init r=0, world=4 2025-12-04T11:58:24.4474644Z FAILED [6.2112s] [100%] 2025-12-04T11:58:24.4474709Z 2025-12-04T11:58:24.4474766Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4474967Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda __ 2025-12-04T11:58:24.4475154Z Traceback (most recent call last): 2025-12-04T11:58:24.4475399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4475645Z self._join_processes(fn) 2025-12-04T11:58:24.4475892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4476158Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4476426Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4476685Z raise RuntimeError(error) 2025-12-04T11:58:24.4476840Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4477004Z Traceback (most recent call last): 2025-12-04T11:58:24.4477271Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4477514Z getattr(self, test_name)() 2025-12-04T11:58:24.4477748Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4477981Z fn() 2025-12-04T11:58:24.4478238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4478469Z method(*args, **kwargs) 2025-12-04T11:58:24.4478691Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4478921Z method(*args, **kwargs) 2025-12-04T11:58:24.4479138Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4479363Z with policy(): 2025-12-04T11:58:24.4479578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4479810Z raise RuntimeError(msg) 2025-12-04T11:58:24.4480252Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4480682Z 2025-12-04T11:58:24.4480758Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4481115Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4481397Z 2025-12-04T11:58:24.4481488Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4481612Z 2025-12-04T11:58:24.4481616Z 2025-12-04T11:58:24.4481695Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4481897Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4482273Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-caab704f2611f4a9.xml - 2025-12-04T11:58:24.4482620Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4482981Z FAILED [6.2112s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4483321Z Traceback (most recent call last): 2025-12-04T11:58:24.4483567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4483812Z getattr(self, test_name)() 2025-12-04T11:58:24.4484045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4484278Z fn() 2025-12-04T11:58:24.4484482Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4484712Z method(*args, **kwargs) 2025-12-04T11:58:24.4484930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4485159Z method(*args, **kwargs) 2025-12-04T11:58:24.4485375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4485599Z with policy(): 2025-12-04T11:58:24.4485811Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4486082Z raise RuntimeError(msg) 2025-12-04T11:58:24.4486508Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4486900Z 2025-12-04T11:58:24.4486977Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4487334Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4487614Z 2025-12-04T11:58:24.4487702Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4487891Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T11:58:24.4488058Z ======================= 1 failed, 7 deselected in 6.22s ======================== 2025-12-04T11:58:24.4488244Z Got exit code 1 2025-12-04T11:58:24.4488497Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda 2025-12-04T11:58:24.4488897Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.4489268Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-32a0f1d064cd3c3f.xml 2025-12-04T11:58:24.4489567Z ============================= test session starts ============================== 2025-12-04T11:58:24.4489777Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4489968Z cachedir: .pytest_cache 2025-12-04T11:58:24.4490195Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4490435Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4490555Z configfile: pytest.ini 2025-12-04T11:58:24.4490783Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4491056Z collecting ... collected 8 items / 1 deselected / 7 selected 2025-12-04T11:58:24.4491217Z stepcurrent: skipping 1 already run items. 2025-12-04T11:58:24.4491349Z Running 7 items in this shard 2025-12-04T11:58:24.4491424Z 2025-12-04T11:58:24.4491749Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda I1204 11:54:30.734000 352990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 353059 2025-12-04T11:58:24.4492259Z I1204 11:54:30.735000 352990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 353060 2025-12-04T11:58:24.4492605Z I1204 11:54:30.736000 352990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 353061 2025-12-04T11:58:24.4492944Z I1204 11:54:30.736000 352990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 353062 2025-12-04T11:58:24.4493636Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4494227Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4494851Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4495435Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4496017Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4496598Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4497176Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4497756Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4498021Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4498415Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4498906Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4499387Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4499870Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4500317Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4500755Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4501306Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4502045Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4502813Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4503540Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4504253Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4504962Z [rank0]:E1204 11:54:35.926000 353059 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4505676Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4506762Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:58:24.4507766Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4508380Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4509345Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4509950Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4510477Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4511187Z [rank0]:E1204 11:54:35.926000 353059 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4511576Z dist init r=0, world=4 2025-12-04T11:58:24.4511885Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4512426Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4513023Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4513596Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4514077Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4514588Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4515029Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4515494Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4515957Z [rank2]:E1204 
11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4516419Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4516878Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4517329Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4517828Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4518344Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4519016Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4519647Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4519998Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4520604Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4521163Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4521528Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4521942Z [rank2]:E1204 11:54:35.927000 353061 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4522185Z dist init r=2, world=4 2025-12-04T11:58:24.4522389Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4522729Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4523215Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4523696Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 
2025-12-04T11:58:24.4524173Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4524621Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4525058Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4525523Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4525984Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4526443Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4526957Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4527407Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4527863Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4528390Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4529063Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 
2025-12-04T11:58:24.4529693Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4530040Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4530680Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4531202Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4531569Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4531983Z [rank1]:E1204 11:54:35.928000 353060 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4532225Z dist init r=1, world=4 2025-12-04T11:58:24.4532426Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4532766Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4533256Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4533734Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4534214Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4534662Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4535104Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4535567Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4536028Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4536522Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4536986Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4537441Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4537897Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4538401Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4539074Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:58:24.4539742Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4540089Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4540694Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4541212Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4541576Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4541990Z [rank3]:E1204 11:54:35.929000 353062 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4542231Z dist init r=3, world=4 2025-12-04T11:58:24.4542333Z FAILED [6.2120s] [ 14%] 2025-12-04T11:58:24.4542399Z 2025-12-04T11:58:24.4542458Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4542657Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda __ 2025-12-04T11:58:24.4542844Z Traceback (most recent call last): 2025-12-04T11:58:24.4543093Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4543339Z self._join_processes(fn) 2025-12-04T11:58:24.4543586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4543855Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4544125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4544385Z raise RuntimeError(error) 2025-12-04T11:58:24.4544541Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4544705Z Traceback (most recent call last): 2025-12-04T11:58:24.4544945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4545186Z getattr(self, test_name)() 2025-12-04T11:58:24.4545454Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4545687Z fn() 2025-12-04T11:58:24.4545890Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4546124Z method(*args, **kwargs) 2025-12-04T11:58:24.4546344Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4546573Z method(*args, **kwargs) 2025-12-04T11:58:24.4546791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4547017Z with policy(): 2025-12-04T11:58:24.4547231Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4547463Z raise RuntimeError(msg) 2025-12-04T11:58:24.4547897Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 2025-12-04T11:58:24.4548353Z 2025-12-04T11:58:24.4548431Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4548788Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4549067Z 2025-12-04T11:58:24.4549159Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4549283Z 2025-12-04T11:58:24.4549347Z Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4549490Z Traceback (most recent call last): 2025-12-04T11:58:24.4549734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4549979Z getattr(self, test_name)() 2025-12-04T11:58:24.4550211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4550445Z fn() 2025-12-04T11:58:24.4550647Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4550876Z method(*args, **kwargs) 2025-12-04T11:58:24.4551094Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4551322Z method(*args, **kwargs) 2025-12-04T11:58:24.4551538Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4551762Z with policy(): 2025-12-04T11:58:24.4551975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4552205Z raise RuntimeError(msg) 2025-12-04T11:58:24.4552628Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 
2025-12-04T11:58:24.4553018Z 2025-12-04T11:58:24.4553095Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4553451Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4553730Z 2025-12-04T11:58:24.4553821Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4553944Z 2025-12-04T11:58:24.4554044Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4554187Z Traceback (most recent call last): 2025-12-04T11:58:24.4554429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4554671Z getattr(self, test_name)() 2025-12-04T11:58:24.4554906Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4555138Z fn() 2025-12-04T11:58:24.4555339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4555568Z method(*args, **kwargs) 2025-12-04T11:58:24.4555787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4556015Z method(*args, **kwargs) 2025-12-04T11:58:24.4556236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4556462Z with policy(): 2025-12-04T11:58:24.4556674Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4556952Z raise RuntimeError(msg) 2025-12-04T11:58:24.4557378Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4557769Z 2025-12-04T11:58:24.4557844Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4558235Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4558518Z 2025-12-04T11:58:24.4558609Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4558735Z 2025-12-04T11:58:24.4558737Z 2025-12-04T11:58:24.4558815Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4559021Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4559397Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-32a0f1d064cd3c3f.xml - 2025-12-04T11:58:24.4559742Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4560104Z FAILED [6.2120s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4560444Z Traceback (most recent call last): 2025-12-04T11:58:24.4560691Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4560934Z getattr(self, test_name)() 2025-12-04T11:58:24.4561167Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4561403Z fn() 2025-12-04T11:58:24.4561604Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4561834Z method(*args, **kwargs) 2025-12-04T11:58:24.4562052Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4562281Z method(*args, **kwargs) 2025-12-04T11:58:24.4562499Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4562727Z with policy(): 2025-12-04T11:58:24.4562980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4563214Z raise RuntimeError(msg) 2025-12-04T11:58:24.4563647Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2459959296 and is now 3214934016. 
2025-12-04T11:58:24.4570906Z 2025-12-04T11:58:24.4570998Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4571371Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4571660Z 2025-12-04T11:58:24.4571754Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4571887Z 2025-12-04T11:58:24.4571954Z Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4572106Z Traceback (most recent call last): 2025-12-04T11:58:24.4572362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4572674Z getattr(self, test_name)() 2025-12-04T11:58:24.4572910Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4573144Z fn() 2025-12-04T11:58:24.4573349Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4573582Z method(*args, **kwargs) 2025-12-04T11:58:24.4573803Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4574033Z method(*args, **kwargs) 2025-12-04T11:58:24.4574255Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4574482Z with policy(): 2025-12-04T11:58:24.4574694Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4574930Z raise RuntimeError(msg) 2025-12-04T11:58:24.4575363Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 
2025-12-04T11:58:24.4575763Z 2025-12-04T11:58:24.4575838Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4576196Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4576482Z 2025-12-04T11:58:24.4576571Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4576697Z 2025-12-04T11:58:24.4576762Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4576905Z Traceback (most recent call last): 2025-12-04T11:58:24.4577154Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4577399Z getattr(self, test_name)() 2025-12-04T11:58:24.4577633Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4577868Z fn() 2025-12-04T11:58:24.4578069Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4578393Z method(*args, **kwargs) 2025-12-04T11:58:24.4578647Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4578880Z method(*args, **kwargs) 2025-12-04T11:58:24.4579095Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4579328Z with policy(): 2025-12-04T11:58:24.4579538Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4579770Z raise RuntimeError(msg) 2025-12-04T11:58:24.4580198Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4580591Z 2025-12-04T11:58:24.4580669Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4581030Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4581309Z 2025-12-04T11:58:24.4581400Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4581627Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4581797Z ======================= 1 failed, 1 deselected in 6.22s ======================== 2025-12-04T11:58:24.4581941Z Got exit code 1 2025-12-04T11:58:24.4582044Z Retrying single test... 
2025-12-04T11:58:24.4582319Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-064375b06a4c88cb.xml 2025-12-04T11:58:24.4582619Z ============================= test session starts ============================== 2025-12-04T11:58:24.4582837Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4583027Z cachedir: .pytest_cache 2025-12-04T11:58:24.4583254Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4583497Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4583620Z configfile: pytest.ini 2025-12-04T11:58:24.4583855Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4584128Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4584476Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4584793Z Running 1 items in this shard 2025-12-04T11:58:24.4584866Z 2025-12-04T11:58:24.4585194Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda I1204 11:54:39.457000 353368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 353437 2025-12-04T11:58:24.4585712Z I1204 11:54:39.458000 353368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 353438 2025-12-04T11:58:24.4586061Z I1204 11:54:39.459000 353368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 353439 2025-12-04T11:58:24.4586404Z I1204 11:54:39.459000 353368 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 353440 2025-12-04T11:58:24.4587118Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4587708Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4588325Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4588913Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4589500Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4590080Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4590662Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4591285Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4591529Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4591875Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4592369Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4592857Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4593341Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4593792Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4594234Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4594700Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4595167Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4595634Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4596101Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4596554Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4597047Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4597511Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4598231Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4598869Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4599225Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4599834Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4600403Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4600775Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4601191Z [rank0]:E1204 11:54:44.806000 353437 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4601437Z dist init r=0, world=4 2025-12-04T11:58:24.4601644Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4601986Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4602473Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4602955Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4603434Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4603885Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4604326Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4604795Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4605258Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4605720Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4606221Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4606672Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4607130Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4607599Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4608317Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 2025-12-04T11:58:24.4608952Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4609303Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4609953Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4610475Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4610844Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4611260Z [rank3]:E1204 11:54:44.811000 353440 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4611503Z dist init r=3, world=4 2025-12-04T11:58:24.4611709Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4612052Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4612537Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4613019Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4613502Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4613949Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4614388Z [rank2]:E1204 11:54:44.822000 353439 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4614851Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4615352Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4615817Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4616280Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4616736Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4617189Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4617656Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4618384Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 
2025-12-04T11:58:24.4619053Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4619404Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4620012Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4620533Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4620900Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4621328Z [rank2]:E1204 11:54:44.822000 353439 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4621572Z dist init r=2, world=4 2025-12-04T11:58:24.4621775Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4622113Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4622602Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4623080Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4623560Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4624007Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4624448Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4624949Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4625415Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4625875Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4626338Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4626786Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4627243Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4627712Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4628448Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4629077Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4629427Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4630033Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4630554Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4630920Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4631330Z [rank1]:E1204 11:54:44.860000 353438 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4631568Z dist init r=1, world=4 2025-12-04T11:58:24.4631668Z FAILED [6.3131s] [100%] 2025-12-04T11:58:24.4631731Z 2025-12-04T11:58:24.4631790Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4631987Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda __ 2025-12-04T11:58:24.4632169Z Traceback (most recent call last): 2025-12-04T11:58:24.4632412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4632652Z self._join_processes(fn) 2025-12-04T11:58:24.4632895Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4633156Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4633421Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4633680Z raise RuntimeError(error) 2025-12-04T11:58:24.4633878Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4634038Z Traceback (most recent call last): 2025-12-04T11:58:24.4634276Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4634517Z getattr(self, test_name)() 2025-12-04T11:58:24.4634750Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4634979Z fn() 2025-12-04T11:58:24.4635178Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4635406Z method(*args, **kwargs) 2025-12-04T11:58:24.4635624Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4635849Z method(*args, **kwargs) 2025-12-04T11:58:24.4636067Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4636291Z with policy(): 2025-12-04T11:58:24.4636502Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4636772Z raise RuntimeError(msg) 2025-12-04T11:58:24.4637196Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4637586Z 2025-12-04T11:58:24.4637660Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4638014Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4638337Z 2025-12-04T11:58:24.4638429Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4638553Z 2025-12-04T11:58:24.4638554Z 2025-12-04T11:58:24.4638633Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4638835Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4639208Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-064375b06a4c88cb.xml - 2025-12-04T11:58:24.4639551Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4639908Z FAILED [6.3131s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4640245Z Traceback (most recent call last): 2025-12-04T11:58:24.4640489Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4640729Z getattr(self, test_name)() 2025-12-04T11:58:24.4640959Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4641189Z fn() 2025-12-04T11:58:24.4641389Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4641615Z method(*args, **kwargs) 2025-12-04T11:58:24.4641830Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4642055Z method(*args, **kwargs) 2025-12-04T11:58:24.4642269Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4642490Z with policy(): 2025-12-04T11:58:24.4642734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4642962Z raise RuntimeError(msg) 2025-12-04T11:58:24.4643386Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4643781Z 2025-12-04T11:58:24.4643856Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4644209Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4644487Z 2025-12-04T11:58:24.4644575Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4644763Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4644924Z ======================= 1 failed, 7 deselected in 6.32s ======================== 2025-12-04T11:58:24.4645059Z Got exit code 1 2025-12-04T11:58:24.4645154Z Retrying single test... 
2025-12-04T11:58:24.4645461Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-86744d037db0ba9d.xml 2025-12-04T11:58:24.4645756Z ============================= test session starts ============================== 2025-12-04T11:58:24.4645965Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4646150Z cachedir: .pytest_cache 2025-12-04T11:58:24.4646373Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4646608Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4646725Z configfile: pytest.ini 2025-12-04T11:58:24.4646953Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4647218Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4647560Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4647873Z Running 1 items in this shard 2025-12-04T11:58:24.4647945Z 2025-12-04T11:58:24.4648311Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda I1204 11:54:48.351000 353746 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 353815 2025-12-04T11:58:24.4648822Z I1204 11:54:48.352000 353746 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 353816 2025-12-04T11:58:24.4649164Z I1204 11:54:48.352000 353746 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 353817 2025-12-04T11:58:24.4649501Z I1204 11:54:48.353000 353746 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 353818 2025-12-04T11:58:24.4650188Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4650773Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4651397Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4651976Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4652554Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4653127Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4653702Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4654274Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4654545Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4654884Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4655371Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4655848Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4656330Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4656775Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4657213Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4657673Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4658134Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4658629Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4659090Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4659537Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4659988Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4660447Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4661160Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2462056448 and is now 3214934016. 2025-12-04T11:58:24.4661794Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4662141Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4662744Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4663262Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4663624Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4664078Z [rank0]:E1204 11:54:53.735000 353815 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4664318Z dist init r=0, world=4 2025-12-04T11:58:24.4664517Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4664851Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4665334Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4665810Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4666288Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4666734Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4667169Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4667630Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4668088Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4668588Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4669045Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4669492Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4669983Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4670443Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4671115Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4671741Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4672088Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4672690Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4673241Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4673600Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4674010Z [rank2]:E1204 11:54:53.735000 353817 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4674247Z dist init r=2, world=4 2025-12-04T11:58:24.4674448Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4674782Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4675263Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4675742Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4676220Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4676666Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4677104Z [rank3]:E1204 11:54:53.747000 353818 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4677565Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4678023Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4678525Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4679020Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4679467Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4679917Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4680380Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4681052Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 3. CUDA driver allocated memory was 2250244096 and is now 3005218816. 
2025-12-04T11:58:24.4681678Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4682023Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4682661Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4683178Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4683538Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4683949Z [rank3]:E1204 11:54:53.747000 353818 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4684187Z dist init r=3, world=4 2025-12-04T11:58:24.4684385Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4684720Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4685201Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4685679Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4686155Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4686600Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4687039Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4687497Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4687956Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4688500Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4688960Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4689410Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.4689860Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4690320Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4690989Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 2317352960 and is now 3072327680. 2025-12-04T11:58:24.4691648Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4691994Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4692594Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4693111Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4693471Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4693882Z [rank1]:E1204 11:54:53.785000 353816 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4694119Z dist init r=1, world=4 2025-12-04T11:58:24.4694221Z FAILED [6.4126s] [100%] 2025-12-04T11:58:24.4694285Z 2025-12-04T11:58:24.4694342Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4694539Z _ TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda __ 2025-12-04T11:58:24.4694722Z Traceback (most recent call last): 2025-12-04T11:58:24.4694967Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4695208Z self._join_processes(fn) 2025-12-04T11:58:24.4695453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4695721Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4695987Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4696244Z raise RuntimeError(error) 2025-12-04T11:58:24.4696397Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4696558Z Traceback (most recent call last): 2025-12-04T11:58:24.4696797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4697036Z getattr(self, test_name)() 2025-12-04T11:58:24.4697289Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4697327Z fn() 2025-12-04T11:58:24.4697479Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4697525Z method(*args, **kwargs) 2025-12-04T11:58:24.4697676Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4697718Z method(*args, **kwargs) 2025-12-04T11:58:24.4697868Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4697907Z with policy(): 2025-12-04T11:58:24.4698059Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4698102Z raise RuntimeError(msg) 2025-12-04T11:58:24.4698508Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4698548Z 2025-12-04T11:58:24.4698624Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4698870Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4698872Z 2025-12-04T11:58:24.4698960Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4698963Z 2025-12-04T11:58:24.4698964Z 2025-12-04T11:58:24.4699042Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4699130Z Process 2 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.4699384Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-86744d037db0ba9d.xml - 2025-12-04T11:58:24.4699447Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4699708Z FAILED [6.4126s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4699756Z Traceback (most recent call last): 2025-12-04T11:58:24.4699921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4699965Z getattr(self, test_name)() 2025-12-04T11:58:24.4700124Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4700160Z fn() 2025-12-04T11:58:24.4700311Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4700353Z method(*args, **kwargs) 2025-12-04T11:58:24.4700503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4700546Z method(*args, **kwargs) 2025-12-04T11:58:24.4700695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4700733Z with policy(): 2025-12-04T11:58:24.4700883Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4700925Z raise RuntimeError(msg) 2025-12-04T11:58:24.4701323Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 2. CUDA driver allocated memory was 2300575744 and is now 3055550464. 2025-12-04T11:58:24.4701326Z 2025-12-04T11:58:24.4701402Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4701647Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4701649Z 2025-12-04T11:58:24.4701736Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4701800Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
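The "Process 2 exited with error code 10" wrapping comes from the multiprocess test harness: each rank runs the test in its own process, _join_processes waits for them, and _check_return_codes turns any non-zero exit code into the RuntimeError shown in the summary. A rough sketch of that pattern using torch.multiprocessing (the actual common_distributed harness manages its processes itself; this is only illustrative):

    import torch.multiprocessing as mp

    def _worker(rank: int, world_size: int) -> None:
        # Per-rank test body; in the log above each rank exits with code 10
        # once its own leak check raises.
        ...

    def run_multiprocess_test(world_size: int = 4) -> None:
        # spawn() joins the workers and raises ProcessExitedException if any
        # rank exits non-zero, roughly what _check_return_codes surfaces as
        # "Process N exited with error code 10".
        mp.spawn(_worker, args=(world_size,), nprocs=world_size)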
2025-12-04T11:58:24.4701862Z ======================= 1 failed, 7 deselected in 6.42s ======================== 2025-12-04T11:58:24.4701901Z Got exit code 1 2025-12-04T11:58:24.4702096Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda 2025-12-04T11:58:24.4702225Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.4702433Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-4c027be4a8a991b6.xml 2025-12-04T11:58:24.4702517Z ============================= test session starts ============================== 2025-12-04T11:58:24.4702631Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4702674Z cachedir: .pytest_cache 2025-12-04T11:58:24.4702833Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4702880Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4702922Z configfile: pytest.ini 2025-12-04T11:58:24.4703088Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4703160Z collecting ... collected 8 items / 2 deselected / 6 selected 2025-12-04T11:58:24.4703217Z stepcurrent: skipping 2 already run items. 2025-12-04T11:58:24.4703261Z Running 6 items in this shard 2025-12-04T11:58:24.4703266Z 2025-12-04T11:58:24.4703625Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda I1204 11:54:57.439000 354124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 354193 2025-12-04T11:58:24.4703781Z I1204 11:54:57.440000 354124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 354194 2025-12-04T11:58:24.4703935Z I1204 11:54:57.441000 354124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 354195 2025-12-04T11:58:24.4704087Z I1204 11:54:57.441000 354124 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 354196 2025-12-04T11:58:24.4704588Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4704654Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4705144Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4705227Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4705717Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4705777Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4706264Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4706323Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4706468Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4706632Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4706946Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4707103Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4707392Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4707520Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4707797Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4707947Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4708254Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4708402Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4708679Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4708815Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4709093Z [rank0]:E1204 11:55:04.487000 354193 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4709242Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4709796Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4709913Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4710111Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4710527Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4710643Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4710858Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4711023Z [rank0]:E1204 11:55:04.487000 354193 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4711102Z dist init r=0, world=4 2025-12-04T11:58:24.4711241Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4711401Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4711687Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4711843Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4712127Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4712253Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4712529Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4712677Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4712955Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4713103Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4713379Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4713516Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4713793Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4713967Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4714482Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4714599Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4714796Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4715207Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4715323Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4715560Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4715725Z [rank2]:E1204 11:55:04.499000 354195 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4715864Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4716025Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4716312Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4716467Z [rank3]:E1204 11:55:04.499000 354196 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4716751Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4716874Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4717153Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4717302Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4717581Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4717727Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4718000Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4718225Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4718503Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4718654Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4719167Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
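The UserWarning repeated near the start of this test run ("FSDP got the argument `device_id` cuda ... which does not have an explicit index") is unrelated to the leak and can be silenced with either fix the message itself suggests: call torch.cuda.set_device() before wrapping, or pass a device with an explicit index. A minimal per-rank sketch (illustrative only, not the code under test):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(model: torch.nn.Module, rank: int) -> FSDP:
        # Bind this process to its GPU first, as the warning recommends.
        torch.cuda.set_device(rank)
        # Passing an indexed device (cuda:<rank>) instead of the bare "cuda"
        # string avoids the "does not have an explicit index" warning.
        return FSDP(model, device_id=torch.device("cuda", rank))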
2025-12-04T11:58:24.4719284Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4719480Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4719888Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4720037Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4720247Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4720415Z [rank3]:E1204 11:55:04.499000 354196 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4720454Z dist init r=2, world=4 2025-12-04T11:58:24.4720494Z dist init r=3, world=4 2025-12-04T11:58:24.4720632Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4720793Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4721079Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4721232Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4721520Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4721644Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4721925Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4722073Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4722350Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4722533Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4722807Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4722945Z [rank1]:E1204 11:55:04.510000 354194 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4723222Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4723371Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4723883Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4724020Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4724218Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4724624Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4724741Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4724951Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4725117Z [rank1]:E1204 11:55:04.510000 354194 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4725157Z dist init r=1, world=4 2025-12-04T11:58:24.4725509Z [rank0]:[W1204 11:55:04.319681436 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4725551Z FAILED [8.8171s] [ 16%] 2025-12-04T11:58:24.4725553Z 2025-12-04T11:58:24.4725612Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4725750Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4725797Z Traceback (most recent call last): 2025-12-04T11:58:24.4725962Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4726008Z self._join_processes(fn) 2025-12-04T11:58:24.4726182Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4726235Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4726417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4726460Z raise RuntimeError(error) 2025-12-04T11:58:24.4726544Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4726611Z Traceback (most recent call last): 2025-12-04T11:58:24.4726774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4726817Z getattr(self, test_name)() 2025-12-04T11:58:24.4726979Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4727014Z fn() 2025-12-04T11:58:24.4727167Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4727207Z method(*args, **kwargs) 2025-12-04T11:58:24.4727359Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4727400Z method(*args, **kwargs) 2025-12-04T11:58:24.4727550Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4727589Z with policy(): 2025-12-04T11:58:24.4727742Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4727785Z raise RuntimeError(msg) 2025-12-04T11:58:24.4728253Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
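The ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit, which can leak resources") concerns teardown rather than the allocator leak itself; the pattern it asks for is an explicit destroy on the way out. A sketch of the usual shape (assumes the rendezvous variables such as MASTER_ADDR and MASTER_PORT are already set in the environment; this is not the harness code):

    import torch.distributed as dist

    def run(rank: int, world_size: int) -> None:
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            ...  # per-rank test or training body
        finally:
            # Explicit teardown avoids the "destroy_process_group() was not
            # called before program exit" warning at interpreter shutdown.
            dist.destroy_process_group()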
2025-12-04T11:58:24.4728256Z 2025-12-04T11:58:24.4728333Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4728614Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4728616Z 2025-12-04T11:58:24.4728707Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4728710Z 2025-12-04T11:58:24.4728711Z 2025-12-04T11:58:24.4728787Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4728878Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4729130Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-4c027be4a8a991b6.xml - 2025-12-04T11:58:24.4729191Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4729484Z FAILED [8.8171s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4729530Z Traceback (most recent call last): 2025-12-04T11:58:24.4729697Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4729739Z getattr(self, test_name)() 2025-12-04T11:58:24.4729900Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4729936Z fn() 2025-12-04T11:58:24.4730089Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4730129Z method(*args, **kwargs) 2025-12-04T11:58:24.4730280Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4730319Z method(*args, **kwargs) 2025-12-04T11:58:24.4730468Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4730505Z with policy(): 2025-12-04T11:58:24.4730690Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4730732Z raise RuntimeError(msg) 2025-12-04T11:58:24.4731125Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4731129Z 2025-12-04T11:58:24.4731205Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4731485Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4731487Z 2025-12-04T11:58:24.4731577Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4731639Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4731702Z ======================= 1 failed, 2 deselected in 8.83s ======================== 2025-12-04T11:58:24.4731777Z Got exit code 1 2025-12-04T11:58:24.4731819Z Retrying single test... 2025-12-04T11:58:24.4732025Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-ee549baed1036602.xml 2025-12-04T11:58:24.4732084Z ============================= test session starts ============================== 2025-12-04T11:58:24.4732196Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4732240Z cachedir: .pytest_cache 2025-12-04T11:58:24.4732400Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4732449Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4732491Z configfile: pytest.ini 2025-12-04T11:58:24.4732656Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4732730Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4733005Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4733050Z Running 1 items in this shard 2025-12-04T11:58:24.4733052Z 2025-12-04T11:58:24.4733407Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda I1204 11:55:08.799000 354526 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 354595 2025-12-04T11:58:24.4733564Z I1204 11:55:08.800000 354526 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 354596 2025-12-04T11:58:24.4733716Z I1204 11:55:08.801000 354526 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 354597 2025-12-04T11:58:24.4733869Z I1204 11:55:08.801000 354526 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 354598 2025-12-04T11:58:24.4734366Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4734431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4734974Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4735035Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4735521Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4735580Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4736068Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4736148Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4736291Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4736455Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4736744Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4736902Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4737189Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4737316Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4737593Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4737741Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4738021Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4738205Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4738482Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4738618Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4738897Z [rank3]:E1204 11:55:15.952000 354598 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4739082Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4739599Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.4739717Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4739912Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4740322Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4740436Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4740688Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4740854Z [rank3]:E1204 11:55:15.952000 354598 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4740893Z dist init r=3, world=4 2025-12-04T11:58:24.4741033Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4741193Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4741479Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4741635Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4741920Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4742045Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4742323Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4742472Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4742750Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4742898Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4743172Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4743329Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4743607Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4743758Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4744270Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4744386Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4744583Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4745010Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4745125Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4745336Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4745503Z [rank2]:E1204 11:55:15.952000 354597 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4745544Z dist init r=2, world=4 2025-12-04T11:58:24.4745680Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4745842Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4746126Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4746280Z [rank1]:E1204 11:55:15.957000 354596 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4746564Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4746689Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4746965Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4747113Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4747390Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4747567Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4747843Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4747980Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4748307Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4748455Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4748969Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
2025-12-04T11:58:24.4749238Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4749434Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4749844Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4749958Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4750171Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4750338Z [rank1]:E1204 11:55:15.957000 354596 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4750377Z dist init r=1, world=4 2025-12-04T11:58:24.4750515Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4750675Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4750964Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4751117Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4751404Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4751527Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4751805Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4751988Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4752265Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4752416Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4752691Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4752828Z [rank0]:E1204 11:55:16.005000 354595 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4753106Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4753256Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4753789Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4753903Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4754102Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4754516Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4754633Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4754844Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4755010Z [rank0]:E1204 11:55:16.005000 354595 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4755051Z dist init r=0, world=4 2025-12-04T11:58:24.4755391Z [rank0]:[W1204 11:55:16.955971913 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4755432Z FAILED [9.0134s] [100%] 2025-12-04T11:58:24.4755434Z 2025-12-04T11:58:24.4755493Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4755630Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4755678Z Traceback (most recent call last): 2025-12-04T11:58:24.4755843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4755888Z self._join_processes(fn) 2025-12-04T11:58:24.4756062Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4756139Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4756320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4756364Z raise RuntimeError(error) 2025-12-04T11:58:24.4756449Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4756495Z Traceback (most recent call last): 2025-12-04T11:58:24.4756659Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4756701Z getattr(self, test_name)() 2025-12-04T11:58:24.4756861Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4756896Z fn() 2025-12-04T11:58:24.4757048Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4757090Z method(*args, **kwargs) 2025-12-04T11:58:24.4757243Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4757285Z method(*args, **kwargs) 2025-12-04T11:58:24.4757435Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4757504Z with policy(): 2025-12-04T11:58:24.4757656Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4757699Z raise RuntimeError(msg) 2025-12-04T11:58:24.4758090Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 
2025-12-04T11:58:24.4758093Z 2025-12-04T11:58:24.4758210Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4758491Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4758494Z 2025-12-04T11:58:24.4758585Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4758587Z 2025-12-04T11:58:24.4758588Z 2025-12-04T11:58:24.4758665Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4758753Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4759007Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-ee549baed1036602.xml - 2025-12-04T11:58:24.4759070Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4759365Z FAILED [9.0134s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4759412Z Traceback (most recent call last): 2025-12-04T11:58:24.4759577Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4759620Z getattr(self, test_name)() 2025-12-04T11:58:24.4759780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4759815Z fn() 2025-12-04T11:58:24.4759967Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4760007Z method(*args, **kwargs) 2025-12-04T11:58:24.4760196Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4760237Z method(*args, **kwargs) 2025-12-04T11:58:24.4760387Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4760426Z with policy(): 2025-12-04T11:58:24.4760580Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4760622Z raise RuntimeError(msg) 2025-12-04T11:58:24.4761012Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.4761014Z 2025-12-04T11:58:24.4761092Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4761375Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4761407Z 2025-12-04T11:58:24.4761497Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4761560Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4761625Z ======================= 1 failed, 7 deselected in 9.02s ======================== 2025-12-04T11:58:24.4761662Z Got exit code 1 2025-12-04T11:58:24.4761704Z Retrying single test... 2025-12-04T11:58:24.4761910Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-09d71ce4d97b7d04.xml 2025-12-04T11:58:24.4761969Z ============================= test session starts ============================== 2025-12-04T11:58:24.4762083Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4762126Z cachedir: .pytest_cache 2025-12-04T11:58:24.4762284Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4762333Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4762376Z configfile: pytest.ini 2025-12-04T11:58:24.4762538Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4762612Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4762884Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4762929Z Running 1 items in this shard 2025-12-04T11:58:24.4762931Z 2025-12-04T11:58:24.4763284Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda I1204 11:55:20.359000 354928 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 354997 2025-12-04T11:58:24.4763442Z I1204 11:55:20.360000 354928 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 354998 2025-12-04T11:58:24.4763595Z I1204 11:55:20.361000 354928 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 354999 2025-12-04T11:58:24.4763750Z I1204 11:55:20.361000 354928 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 355000 2025-12-04T11:58:24.4764270Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4764332Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4764826Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4764887Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4765374Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4765431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4765938Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4765997Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4766141Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4766307Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4766597Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4766756Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4767041Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4767168Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4767448Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4767596Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4767876Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4768023Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4768348Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4768521Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4768800Z [rank0]:E1204 11:55:27.406000 354997 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4768952Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4769469Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4769587Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4769783Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4770225Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4770341Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4770555Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4770722Z [rank0]:E1204 11:55:27.406000 354997 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4770762Z dist init r=0, world=4 2025-12-04T11:58:24.4770901Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4771061Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4771349Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4771502Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4771791Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4771917Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4772194Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4772345Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4772620Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4772790Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4773066Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4773206Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4773483Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4773632Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4774147Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.4774284Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4774480Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4774885Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4775003Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4775217Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4775383Z [rank3]:E1204 11:55:27.409000 355000 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4775424Z dist init r=3, world=4 2025-12-04T11:58:24.4775562Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4775722Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4776011Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4776167Z [rank2]:E1204 11:55:27.421000 354999 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4776450Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4776576Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4776852Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4777017Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4777297Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4777447Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4777724Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4777860Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4778140Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4778342Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4778858Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.4779011Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4779206Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4779614Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4779731Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4779943Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4780107Z [rank2]:E1204 11:55:27.421000 354999 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4780146Z dist init r=2, world=4 2025-12-04T11:58:24.4780286Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4780445Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4780731Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4780887Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4781173Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4781296Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4781609Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4781760Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4782037Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4782186Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4782462Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4782600Z [rank1]:E1204 11:55:27.427000 354998 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4782876Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4783047Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4783561Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4783675Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4783872Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4784282Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4784399Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4784612Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4784776Z [rank1]:E1204 11:55:27.427000 354998 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4784818Z dist init r=1, world=4 2025-12-04T11:58:24.4785158Z [rank0]:[W1204 11:55:27.239354586 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4785198Z FAILED [8.9149s] [100%] 2025-12-04T11:58:24.4785201Z 2025-12-04T11:58:24.4785258Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4785394Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4785442Z Traceback (most recent call last): 2025-12-04T11:58:24.4785627Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4785672Z self._join_processes(fn) 2025-12-04T11:58:24.4785846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4785902Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4786081Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4786127Z raise RuntimeError(error) 2025-12-04T11:58:24.4786209Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4786254Z Traceback (most recent call last): 2025-12-04T11:58:24.4786417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4786460Z getattr(self, test_name)() 2025-12-04T11:58:24.4786619Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4786655Z fn() 2025-12-04T11:58:24.4786805Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4786868Z method(*args, **kwargs) 2025-12-04T11:58:24.4787018Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4787060Z method(*args, **kwargs) 2025-12-04T11:58:24.4787209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4787247Z with policy(): 2025-12-04T11:58:24.4787397Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4787440Z raise RuntimeError(msg) 2025-12-04T11:58:24.4787829Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4787833Z 2025-12-04T11:58:24.4787910Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4788235Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4788239Z 2025-12-04T11:58:24.4788327Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4788330Z 2025-12-04T11:58:24.4788331Z 2025-12-04T11:58:24.4788407Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4788495Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4788746Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-09d71ce4d97b7d04.xml - 2025-12-04T11:58:24.4788809Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4789103Z FAILED [8.9149s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4789149Z Traceback (most recent call last): 2025-12-04T11:58:24.4789314Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4789356Z getattr(self, test_name)() 2025-12-04T11:58:24.4789551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4789586Z fn() 2025-12-04T11:58:24.4789738Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4789780Z method(*args, **kwargs) 2025-12-04T11:58:24.4789931Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4789972Z method(*args, **kwargs) 2025-12-04T11:58:24.4790121Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4790158Z with policy(): 2025-12-04T11:58:24.4790310Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4790351Z raise RuntimeError(msg) 2025-12-04T11:58:24.4790740Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4790771Z 2025-12-04T11:58:24.4790846Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4791125Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4791127Z 2025-12-04T11:58:24.4791216Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4791280Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4791343Z ======================= 1 failed, 7 deselected in 8.92s ======================== 2025-12-04T11:58:24.4791382Z Got exit code 1 2025-12-04T11:58:24.4791609Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4791739Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.4791949Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-2ae3cfecb382b0b2.xml 2025-12-04T11:58:24.4792007Z ============================= test session starts ============================== 2025-12-04T11:58:24.4792119Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4792161Z cachedir: .pytest_cache 2025-12-04T11:58:24.4792318Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4792367Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4792409Z configfile: pytest.ini 2025-12-04T11:58:24.4792572Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4792645Z collecting ... collected 8 items / 3 deselected / 5 selected 2025-12-04T11:58:24.4792699Z stepcurrent: skipping 3 already run items. 2025-12-04T11:58:24.4792742Z Running 5 items in this shard 2025-12-04T11:58:24.4792744Z 2025-12-04T11:58:24.4793100Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda I1204 11:55:31.957000 355330 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 355399 2025-12-04T11:58:24.4793254Z I1204 11:55:31.958000 355330 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 355400 2025-12-04T11:58:24.4793431Z I1204 11:55:31.959000 355330 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 355401 2025-12-04T11:58:24.4793584Z I1204 11:55:31.959000 355330 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 355402 2025-12-04T11:58:24.4794082Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4794144Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4794633Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4794694Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4795178Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4795267Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4795754Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4795811Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4795956Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4796119Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4796410Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4796567Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4796852Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4796978Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4797255Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4797403Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4797677Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4797846Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4798124Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4798306Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4798584Z [rank2]:E1204 11:55:39.098000 355401 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4798731Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4799248Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4799396Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4799593Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4800001Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4800116Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4800330Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4800497Z [rank2]:E1204 11:55:39.098000 355401 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4800538Z dist init r=2, world=4 2025-12-04T11:58:24.4800676Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4800837Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4801127Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4801283Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4801569Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4801693Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4801968Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4802148Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4802427Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4802575Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4802852Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4802989Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4803266Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4803414Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4803946Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.4804063Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4804258Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4804664Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4804780Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4804992Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4805157Z [rank3]:E1204 11:55:39.100000 355402 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4805197Z dist init r=3, world=4 2025-12-04T11:58:24.4805337Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4805496Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4805786Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4805939Z [rank0]:E1204 11:55:39.113000 355399 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4806225Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4806370Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4806645Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4806794Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4807069Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4807217Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4807493Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4807629Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4807926Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4808074Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4808628Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
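The UserWarning emitted during FSDP initialization earlier in this session ("FSDP got the argument `device_id` cuda ... which does not have an explicit index") itself suggests the two remedies sketched below. This is a minimal illustration under assumed setup (placeholder module, default process group already initialized), not the test file's actual code.

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(rank: int) -> FSDP:
    # Assumes the default process group has already been initialized.
    # Option 1: make the current device explicit before FSDP initialization.
    torch.cuda.set_device(rank)
    # Option 2: pass an indexed device instead of the bare "cuda" string.
    device = torch.device("cuda", rank)
    model = nn.Linear(8, 8).to(device)  # placeholder module for illustration
    return FSDP(model, device_id=device)

Either option silences the warning because FSDP no longer has to guess which device index a bare "cuda" refers to on each rank.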
2025-12-04T11:58:24.4808741Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4808940Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4809346Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4809460Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4809675Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4809839Z [rank0]:E1204 11:55:39.113000 355399 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4809884Z dist init r=0, world=4 2025-12-04T11:58:24.4810022Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4810184Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4810470Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4810668Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4810955Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4811081Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4811357Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4811506Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4811789Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4811937Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4812259Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4812398Z [rank1]:E1204 11:55:39.115000 355400 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4812680Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4812834Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4813346Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4813463Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4813659Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4814068Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4814188Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4814405Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4814577Z [rank1]:E1204 11:55:39.115000 355400 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4814619Z dist init r=1, world=4 2025-12-04T11:58:24.4815285Z [rank0]:[W1204 11:55:39.995830833 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4815329Z FAILED [9.1148s] [ 20%] 2025-12-04T11:58:24.4815332Z 2025-12-04T11:58:24.4815395Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4815533Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.4815589Z Traceback (most recent call last): 2025-12-04T11:58:24.4815755Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4815806Z self._join_processes(fn) 2025-12-04T11:58:24.4819171Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4819234Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4819426Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4819472Z raise RuntimeError(error) 2025-12-04T11:58:24.4819555Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4819604Z Traceback (most recent call last): 2025-12-04T11:58:24.4819771Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4819867Z getattr(self, test_name)() 2025-12-04T11:58:24.4820027Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4820064Z fn() 2025-12-04T11:58:24.4820218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4820263Z method(*args, **kwargs) 2025-12-04T11:58:24.4820414Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4820459Z method(*args, **kwargs) 2025-12-04T11:58:24.4820610Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4820647Z with policy(): 2025-12-04T11:58:24.4820805Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4820846Z raise RuntimeError(msg) 2025-12-04T11:58:24.4821240Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4821243Z 2025-12-04T11:58:24.4821321Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4821610Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4821612Z 2025-12-04T11:58:24.4821703Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4821707Z 2025-12-04T11:58:24.4821769Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4821816Z Traceback (most recent call last): 2025-12-04T11:58:24.4821983Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4822028Z getattr(self, test_name)() 2025-12-04T11:58:24.4822188Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4822225Z fn() 2025-12-04T11:58:24.4822407Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4822448Z method(*args, **kwargs) 2025-12-04T11:58:24.4822597Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4822640Z method(*args, **kwargs) 2025-12-04T11:58:24.4822792Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4822832Z with policy(): 2025-12-04T11:58:24.4822984Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4823026Z raise RuntimeError(msg) 2025-12-04T11:58:24.4823417Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.4823419Z 2025-12-04T11:58:24.4823496Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4823778Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4823803Z 2025-12-04T11:58:24.4823891Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4823893Z 2025-12-04T11:58:24.4823955Z Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4824001Z Traceback (most recent call last): 2025-12-04T11:58:24.4824167Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4824210Z getattr(self, test_name)() 2025-12-04T11:58:24.4824375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4824410Z fn() 2025-12-04T11:58:24.4824563Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4824604Z method(*args, **kwargs) 2025-12-04T11:58:24.4824756Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4824796Z method(*args, **kwargs) 2025-12-04T11:58:24.4824949Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4824987Z with policy(): 2025-12-04T11:58:24.4825140Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4825181Z raise RuntimeError(msg) 2025-12-04T11:58:24.4825572Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.4825576Z 2025-12-04T11:58:24.4825652Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4825932Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4825934Z 2025-12-04T11:58:24.4826023Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4826025Z 2025-12-04T11:58:24.4826027Z 2025-12-04T11:58:24.4826106Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4826196Z Process 0 terminated with exit code 10, terminating remaining processes. 
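The failure above is raised by PyTorch's CUDA memory-leak checker (enabled in this run via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1): it snapshots per-device memory before the test body and compares it afterwards, and if either the caching-allocator figure or the driver-level allocation has grown it raises the RuntimeError quoted in the traceback. A minimal conceptual sketch of that before/after comparison, using only public torch.cuda APIs rather than the internal CudaMemoryLeakCheck helper (the function name, the slack threshold, and the exact bookkeeping below are illustrative assumptions, not the test suite's implementation):

    import torch

    def check_for_leak(fn, device=0, driver_slack_bytes=0):
        """Run fn() and report whether GPU memory on `device` grew.

        Conceptual stand-in for the leak check the test harness applies;
        the real checker lives in torch/testing/_internal/common_utils.py.
        """
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)       # caching-allocator view
        free_before, total = torch.cuda.mem_get_info(device)     # driver view
        driver_before = total - free_before

        fn()                                                      # the test body under check

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()                                  # release cached blocks first
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        if alloc_after > alloc_before or driver_after > driver_before + driver_slack_bytes:
            raise RuntimeError(
                f"possible leak on device {device}: allocator "
                f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
            )

In the log the allocator goes from 512 bytes to a few kilobytes and the driver-level allocation grows by roughly 1.2 GB on every rank, which is what trips the check.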
2025-12-04T11:58:24.4826473Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-2ae3cfecb382b0b2.xml - 2025-12-04T11:58:24.4826537Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4826833Z FAILED [9.1148s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4826882Z Traceback (most recent call last): 2025-12-04T11:58:24.4827045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4827090Z getattr(self, test_name)() 2025-12-04T11:58:24.4827251Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4827288Z fn() 2025-12-04T11:58:24.4827439Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4827480Z method(*args, **kwargs) 2025-12-04T11:58:24.4827657Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4827698Z method(*args, **kwargs) 2025-12-04T11:58:24.4827848Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4827885Z with policy(): 2025-12-04T11:58:24.4828036Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4828078Z raise RuntimeError(msg) 2025-12-04T11:58:24.4828505Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4828507Z 2025-12-04T11:58:24.4828582Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4828867Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4828869Z 2025-12-04T11:58:24.4828959Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4828961Z 2025-12-04T11:58:24.4829021Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.4829071Z Traceback (most recent call last): 2025-12-04T11:58:24.4829236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4829280Z getattr(self, test_name)() 2025-12-04T11:58:24.4829439Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4829476Z fn() 2025-12-04T11:58:24.4829628Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4829669Z method(*args, **kwargs) 2025-12-04T11:58:24.4829819Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4829862Z method(*args, **kwargs) 2025-12-04T11:58:24.4830012Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4830052Z with policy(): 2025-12-04T11:58:24.4830242Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4830287Z raise RuntimeError(msg) 2025-12-04T11:58:24.4830674Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.4830678Z 2025-12-04T11:58:24.4830751Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4831032Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4831034Z 2025-12-04T11:58:24.4831122Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4831124Z 2025-12-04T11:58:24.4831186Z Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.4831232Z Traceback (most recent call last): 2025-12-04T11:58:24.4831397Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4831439Z getattr(self, test_name)() 2025-12-04T11:58:24.4831634Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4831669Z fn() 2025-12-04T11:58:24.4831820Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4831860Z method(*args, **kwargs) 2025-12-04T11:58:24.4832012Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4832053Z method(*args, **kwargs) 2025-12-04T11:58:24.4832206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4832243Z with policy(): 2025-12-04T11:58:24.4832398Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4832442Z raise RuntimeError(msg) 2025-12-04T11:58:24.4832832Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.4832834Z 2025-12-04T11:58:24.4832908Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4833184Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4833188Z 2025-12-04T11:58:24.4833275Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4833341Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4833617Z ======================= 1 failed, 3 deselected in 9.13s ======================== 2025-12-04T11:58:24.4833656Z Got exit code 1 2025-12-04T11:58:24.4833700Z Retrying single test... 
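Each attempt also prints the ProcessGroupNCCL warning that destroy_process_group() was not called before the worker exited. For a spawned multi-process run like this one, the usual pattern is for every rank to tear the group down explicitly before returning; a minimal sketch of that pattern (the rendezvous address/port and the toy collective below are placeholders, not taken from the test above):

    import os
    import torch
    import torch.distributed as dist
    import torch.multiprocessing as mp

    def worker(rank, world_size):
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")   # placeholder rendezvous
        os.environ.setdefault("MASTER_PORT", "29500")
        torch.cuda.set_device(rank)
        dist.init_process_group("nccl", rank=rank, world_size=world_size)

        t = torch.ones(1, device=f"cuda:{rank}")
        dist.all_reduce(t)                                   # toy workload

        dist.barrier()                                       # ensure all ranks are done
        dist.destroy_process_group()                         # avoids the NCCL shutdown warning

    if __name__ == "__main__":
        mp.spawn(worker, args=(4,), nprocs=4)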
2025-12-04T11:58:24.4833910Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-6e9caf4cb6074b53.xml 2025-12-04T11:58:24.4833969Z ============================= test session starts ============================== 2025-12-04T11:58:24.4834083Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4834125Z cachedir: .pytest_cache 2025-12-04T11:58:24.4834306Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4834356Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4834396Z configfile: pytest.ini 2025-12-04T11:58:24.4834561Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4834640Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4834914Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4834961Z Running 1 items in this shard 2025-12-04T11:58:24.4834964Z 2025-12-04T11:58:24.4835318Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda I1204 11:55:43.715000 355732 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 355801 2025-12-04T11:58:24.4835476Z I1204 11:55:43.716000 355732 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 355802 2025-12-04T11:58:24.4835628Z I1204 11:55:43.717000 355732 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 355803 2025-12-04T11:58:24.4835802Z I1204 11:55:43.717000 355732 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 355804 2025-12-04T11:58:24.4836300Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4836365Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4836857Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4836919Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4837406Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4837463Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4837950Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4838009Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4838189Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4838354Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4838685Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4838845Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4839133Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4839262Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4839540Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4839690Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4839967Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4840146Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4840430Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4840566Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4840847Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4840998Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4841522Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2462056448 and is now 3663724544. 2025-12-04T11:58:24.4841644Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4841840Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4842252Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4842369Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4842584Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4842752Z [rank0]:E1204 11:55:50.869000 355801 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4842792Z dist init r=0, world=4 2025-12-04T11:58:24.4842950Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4843109Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4843397Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4843553Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4843841Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4843966Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4844245Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4844393Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4844693Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4844841Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4845116Z [rank3]:E1204 11:55:50.881000 355804 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4845253Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4845530Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4845681Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4846203Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.4846317Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4846513Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4846923Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4847037Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4847267Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4847430Z [rank3]:E1204 11:55:50.881000 355804 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4847472Z dist init r=3, world=4 2025-12-04T11:58:24.4847611Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4847771Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4848060Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4848263Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4848550Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4848674Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T11:58:24.4848987Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4849135Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4849412Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4849557Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4849835Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4849971Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4850251Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4850400Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4850918Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.4851034Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4851230Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4851672Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4851786Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4852000Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4852168Z [rank2]:E1204 11:55:50.884000 355803 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4852207Z dist init r=2, world=4 2025-12-04T11:58:24.4852347Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4852505Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4852797Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4852950Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4853257Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4853380Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4853658Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4853808Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4854083Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4854233Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4854506Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4854644Z [rank1]:E1204 11:55:50.927000 355802 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4854923Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4855072Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4855585Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4855700Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4855917Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4856324Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4856439Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4856649Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4856815Z [rank1]:E1204 11:55:50.927000 355802 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4856856Z dist init r=1, world=4 2025-12-04T11:58:24.4857199Z [rank0]:[W1204 11:55:51.700664178 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4857262Z FAILED [9.0131s] [100%] 2025-12-04T11:58:24.4857264Z 2025-12-04T11:58:24.4857322Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4857459Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.4857505Z Traceback (most recent call last): 2025-12-04T11:58:24.4857671Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4857715Z self._join_processes(fn) 2025-12-04T11:58:24.4857894Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4857949Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4858131Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4858216Z raise RuntimeError(error) 2025-12-04T11:58:24.4858301Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4858348Z Traceback (most recent call last): 2025-12-04T11:58:24.4858511Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4858555Z getattr(self, test_name)() 2025-12-04T11:58:24.4858713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4858751Z fn() 2025-12-04T11:58:24.4858904Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4858948Z method(*args, **kwargs) 2025-12-04T11:58:24.4859101Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4859146Z method(*args, **kwargs) 2025-12-04T11:58:24.4859295Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4859335Z with policy(): 2025-12-04T11:58:24.4859486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4859531Z raise RuntimeError(msg) 2025-12-04T11:58:24.4859953Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2462056448 and is now 3663724544. 
2025-12-04T11:58:24.4859956Z 2025-12-04T11:58:24.4860036Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4860319Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4860323Z 2025-12-04T11:58:24.4860414Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4860416Z 2025-12-04T11:58:24.4860418Z 2025-12-04T11:58:24.4860495Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4860583Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4860838Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-6e9caf4cb6074b53.xml - 2025-12-04T11:58:24.4860899Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4861197Z FAILED [9.0131s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4861282Z Traceback (most recent call last): 2025-12-04T11:58:24.4861452Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4861495Z getattr(self, test_name)() 2025-12-04T11:58:24.4861658Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4861693Z fn() 2025-12-04T11:58:24.4861849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4861891Z method(*args, **kwargs) 2025-12-04T11:58:24.4862043Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4862082Z method(*args, **kwargs) 2025-12-04T11:58:24.4862236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4862276Z with policy(): 2025-12-04T11:58:24.4862428Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4862469Z raise RuntimeError(msg) 2025-12-04T11:58:24.4862857Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2462056448 and is now 3663724544. 2025-12-04T11:58:24.4862860Z 2025-12-04T11:58:24.4862938Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4863218Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4863222Z 2025-12-04T11:58:24.4863309Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4863371Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4863437Z ======================= 1 failed, 7 deselected in 9.02s ======================== 2025-12-04T11:58:24.4863475Z Got exit code 1 2025-12-04T11:58:24.4863518Z Retrying single test... 2025-12-04T11:58:24.4863727Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-74c155eb338d617d.xml 2025-12-04T11:58:24.4863806Z ============================= test session starts ============================== 2025-12-04T11:58:24.4863920Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4863961Z cachedir: .pytest_cache 2025-12-04T11:58:24.4864121Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4864169Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4864211Z configfile: pytest.ini 2025-12-04T11:58:24.4864376Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4864448Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4864722Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4864767Z Running 1 items in this shard 2025-12-04T11:58:24.4864769Z 2025-12-04T11:58:24.4865123Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda I1204 11:55:55.294000 356134 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 356203 2025-12-04T11:58:24.4865298Z I1204 11:55:55.295000 356134 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 356204 2025-12-04T11:58:24.4865451Z I1204 11:55:55.295000 356134 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 356205 2025-12-04T11:58:24.4865601Z I1204 11:55:55.296000 356134 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 356206 2025-12-04T11:58:24.4866104Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4866168Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4866656Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4866718Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4867204Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4867266Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4867748Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4867806Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4867951Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4868132Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4868459Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4868616Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4868902Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4869027Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4869305Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4869453Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4869764Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4869911Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4870189Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4870326Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4870605Z [rank1]:E1204 11:56:02.484000 356204 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4870755Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4871275Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4871390Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4871586Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4871994Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4872109Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4872352Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4872517Z [rank1]:E1204 11:56:02.484000 356204 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4872558Z dist init r=1, world=4 2025-12-04T11:58:24.4872696Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4872859Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4873148Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4873303Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4873589Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4873715Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4874011Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4874158Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4874433Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4874581Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4874856Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4874993Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4875273Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4875422Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4875936Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.4876052Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4876246Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4876679Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4876794Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4877004Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4877171Z [rank3]:E1204 11:56:02.502000 356206 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4877211Z dist init r=3, world=4 2025-12-04T11:58:24.4877348Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4877509Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4877797Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4877950Z [rank0]:E1204 11:56:02.577000 356203 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4878343Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4878467Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4878742Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4878892Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4879167Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4879316Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4879590Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4879728Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4880008Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4880156Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4880672Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4880786Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4881016Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4881421Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4881537Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4881748Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4881911Z [rank0]:E1204 11:56:02.577000 356203 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4881951Z dist init r=0, world=4 2025-12-04T11:58:24.4882091Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4882252Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4882568Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4882724Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4883008Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4883133Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4883408Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4883557Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4883833Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4883978Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4884254Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4884391Z [rank2]:E1204 11:56:02.594000 356205 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4884670Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4884819Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4885348Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4885463Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4885659Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4886065Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4886178Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4886391Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4886555Z [rank2]:E1204 11:56:02.594000 356205 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4886614Z dist init r=2, world=4 2025-12-04T11:58:24.4886955Z [rank0]:[W1204 11:56:03.658216290 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4886995Z FAILED [9.0134s] [100%] 2025-12-04T11:58:24.4886997Z 2025-12-04T11:58:24.4887053Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4887190Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.4887239Z Traceback (most recent call last): 2025-12-04T11:58:24.4887402Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4887448Z self._join_processes(fn) 2025-12-04T11:58:24.4887621Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4887678Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4887856Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4887900Z raise RuntimeError(error) 2025-12-04T11:58:24.4887982Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4888028Z Traceback (most recent call last): 2025-12-04T11:58:24.4888232Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4888275Z getattr(self, test_name)() 2025-12-04T11:58:24.4888435Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4888470Z fn() 2025-12-04T11:58:24.4888624Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4888665Z method(*args, **kwargs) 2025-12-04T11:58:24.4888816Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4888857Z method(*args, **kwargs) 2025-12-04T11:58:24.4889008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4889045Z with policy(): 2025-12-04T11:58:24.4889231Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4889272Z raise RuntimeError(msg) 2025-12-04T11:58:24.4889664Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
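[editor's note] The leak report above is raised from a context manager's `__exit__` (common_utils.py line 2705 in the traceback) after comparing per-device memory counters taken before and after the test body. Below is a rough, illustrative sketch of that kind of before/after comparison; the class name `LeakCheckSketch`, the helper `_driver_allocated`, and the exact bookkeeping are assumptions for illustration, not the real check in torch/testing/_internal/common_utils.py.

```python
# Hedged sketch of a CUDA memory-leak check as a context manager:
# snapshot allocator and driver-level memory per device on entry,
# compare on exit, and raise if usage grew. Illustrative only.
import torch


def _driver_allocated(i: int) -> int:
    # Driver-level usage approximated as total minus free (cudaMemGetInfo).
    free, total = torch.cuda.mem_get_info(i)
    return total - free


class LeakCheckSketch:
    def __enter__(self):
        torch.cuda.synchronize()
        self.before = [
            (torch.cuda.memory_allocated(i), _driver_allocated(i))
            for i in range(torch.cuda.device_count())
        ]
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            return False  # never mask the test's own exception
        torch.cuda.synchronize()
        for i, (alloc_before, drv_before) in enumerate(self.before):
            alloc_after = torch.cuda.memory_allocated(i)
            if alloc_after > alloc_before:
                raise RuntimeError(
                    f"Possible leak: caching allocator allocated memory was "
                    f"{alloc_before} and is now reported as {alloc_after} on "
                    f"device {i}. Driver allocated memory was {drv_before} "
                    f"and is now {_driver_allocated(i)}."
                )
        return False
```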
2025-12-04T11:58:24.4889668Z 2025-12-04T11:58:24.4889743Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4890023Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4890025Z 2025-12-04T11:58:24.4890114Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4890117Z 2025-12-04T11:58:24.4890118Z 2025-12-04T11:58:24.4890195Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4890283Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4890533Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-74c155eb338d617d.xml - 2025-12-04T11:58:24.4890619Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4890911Z FAILED [9.0134s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4890958Z Traceback (most recent call last): 2025-12-04T11:58:24.4891122Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4891168Z getattr(self, test_name)() 2025-12-04T11:58:24.4891329Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4891365Z fn() 2025-12-04T11:58:24.4891516Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4891559Z method(*args, **kwargs) 2025-12-04T11:58:24.4891710Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4891749Z method(*args, **kwargs) 2025-12-04T11:58:24.4891900Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4891936Z with policy(): 2025-12-04T11:58:24.4892087Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4892130Z raise RuntimeError(msg) 2025-12-04T11:58:24.4892518Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4892522Z 2025-12-04T11:58:24.4892597Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4892877Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4892879Z 2025-12-04T11:58:24.4892966Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4893030Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4893116Z ======================= 1 failed, 7 deselected in 9.02s ======================== 2025-12-04T11:58:24.4893156Z Got exit code 1 2025-12-04T11:58:24.4893382Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4893515Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.4893726Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-80c8d1bffe0a078c.xml 2025-12-04T11:58:24.4893783Z ============================= test session starts ============================== 2025-12-04T11:58:24.4893898Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4893942Z cachedir: .pytest_cache 2025-12-04T11:58:24.4894105Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4894152Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4894193Z configfile: pytest.ini 2025-12-04T11:58:24.4894354Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4894447Z collecting ... collected 8 items / 4 deselected / 4 selected 2025-12-04T11:58:24.4894500Z stepcurrent: skipping 4 already run items. 2025-12-04T11:58:24.4894546Z Running 4 items in this shard 2025-12-04T11:58:24.4894548Z 2025-12-04T11:58:24.4894902Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda I1204 11:56:07.019000 356536 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 356605 2025-12-04T11:58:24.4895059Z I1204 11:56:07.020000 356536 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 356606 2025-12-04T11:58:24.4895211Z I1204 11:56:07.020000 356536 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 356607 2025-12-04T11:58:24.4895363Z I1204 11:56:07.021000 356536 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 356608 2025-12-04T11:58:24.4895866Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4895928Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4896422Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
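[editor's note] Two warnings recur throughout this run: the FSDP UserWarning that `device_id` "cuda" has no explicit index (which suggests calling `torch.cuda.set_device()` before FSDP initialization or passing an indexed device), and the ProcessGroupNCCL warning that `destroy_process_group()` was not called before exit. A minimal sketch of the pattern both warnings recommend is below; the rendezvous settings (MASTER_ADDR/MASTER_PORT), the toy `nn.Linear` model, and the `run` function are placeholders, not the test's actual setup.

```python
# Hedged sketch of a per-rank worker that avoids both warnings seen above:
# bind the process to its GPU before FSDP init, and tear down the process
# group before exiting.
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def run(rank: int, world_size: int) -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # placeholder rendezvous
    os.environ.setdefault("MASTER_PORT", "29500")
    # Avoid "device_id cuda ... does not have an explicit index":
    torch.cuda.set_device(rank)
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        model = FSDP(
            nn.Linear(8, 8).cuda(rank),
            device_id=torch.device("cuda", rank),  # explicit device index
        )
        # ... test / training body for this rank ...
    finally:
        # Avoid "destroy_process_group() was not called before program exit":
        dist.destroy_process_group()
```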
2025-12-04T11:58:24.4896483Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4896969Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4897028Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4897527Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4897589Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4897732Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4897897Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4898222Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4898381Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4898668Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4898831Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4899108Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4899256Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4899534Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4899681Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4899961Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4900098Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4900380Z [rank1]:E1204 11:56:14.165000 356606 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4900532Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4901048Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4901166Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4901361Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4901797Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4901912Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4902125Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4902289Z [rank1]:E1204 11:56:14.165000 356606 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4902329Z dist init r=1, world=4 2025-12-04T11:58:24.4902467Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4902627Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4902916Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4903090Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4903375Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4903500Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4903776Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4903924Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4904198Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4904347Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4904620Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4904758Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4905039Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4905188Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4905703Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4905817Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4906030Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4906438Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4906553Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4906764Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4906930Z [rank2]:E1204 11:56:14.175000 356607 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4906970Z dist init r=2, world=4 2025-12-04T11:58:24.4907106Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4907267Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4907581Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4907736Z [rank3]:E1204 11:56:14.180000 356608 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4908023Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4908187Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4908464Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4908612Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4908888Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4909034Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4909310Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4909448Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4909727Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4909876Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4910418Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
2025-12-04T11:58:24.4910535Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4910730Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4911136Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4911252Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4911461Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4911626Z [rank3]:E1204 11:56:14.180000 356608 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4911696Z dist init r=3, world=4 2025-12-04T11:58:24.4911835Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4911995Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4912285Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4912438Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4912722Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4912848Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4913123Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4913271Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4913547Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4913694Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4913969Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4914106Z [rank0]:E1204 11:56:14.259000 356605 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4914404Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4914552Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4915067Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4915182Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4915380Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4915787Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4915926Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4916137Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4916301Z [rank0]:E1204 11:56:14.259000 356605 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4916341Z dist init r=0, world=4 2025-12-04T11:58:24.4916681Z [rank0]:[W1204 11:56:14.214690072 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4916722Z FAILED [9.1141s] [ 25%] 2025-12-04T11:58:24.4916724Z 2025-12-04T11:58:24.4916780Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4916919Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4916965Z Traceback (most recent call last): 2025-12-04T11:58:24.4917130Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4917173Z self._join_processes(fn) 2025-12-04T11:58:24.4917348Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4917403Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4917582Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4917627Z raise RuntimeError(error) 2025-12-04T11:58:24.4917708Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4917755Z Traceback (most recent call last): 2025-12-04T11:58:24.4917916Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4917959Z getattr(self, test_name)() 2025-12-04T11:58:24.4918116Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4918185Z fn() 2025-12-04T11:58:24.4918336Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4918378Z method(*args, **kwargs) 2025-12-04T11:58:24.4918559Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4918600Z method(*args, **kwargs) 2025-12-04T11:58:24.4918751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4918791Z with policy(): 2025-12-04T11:58:24.4918942Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4918984Z raise RuntimeError(msg) 2025-12-04T11:58:24.4919371Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
2025-12-04T11:58:24.4919375Z 2025-12-04T11:58:24.4919452Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4919733Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4919767Z 2025-12-04T11:58:24.4919855Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4919857Z 2025-12-04T11:58:24.4919859Z 2025-12-04T11:58:24.4919934Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4920022Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4920273Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-80c8d1bffe0a078c.xml - 2025-12-04T11:58:24.4920333Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4920626Z FAILED [9.1141s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.4920672Z Traceback (most recent call last): 2025-12-04T11:58:24.4920838Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4920881Z getattr(self, test_name)() 2025-12-04T11:58:24.4921040Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4921075Z fn() 2025-12-04T11:58:24.4921225Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4921266Z method(*args, **kwargs) 2025-12-04T11:58:24.4921418Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4921458Z method(*args, **kwargs) 2025-12-04T11:58:24.4921607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4921646Z with policy(): 2025-12-04T11:58:24.4921797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4921838Z raise RuntimeError(msg) 2025-12-04T11:58:24.4922227Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4922229Z 2025-12-04T11:58:24.4922304Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4922603Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4922606Z 2025-12-04T11:58:24.4922693Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4922759Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4922820Z ======================= 1 failed, 4 deselected in 9.12s ======================== 2025-12-04T11:58:24.4922857Z Got exit code 1 2025-12-04T11:58:24.4922897Z Retrying single test... 2025-12-04T11:58:24.4923104Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-747b209bb437d791.xml 2025-12-04T11:58:24.4923161Z ============================= test session starts ============================== 2025-12-04T11:58:24.4923278Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4923319Z cachedir: .pytest_cache 2025-12-04T11:58:24.4923480Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4923555Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4923597Z configfile: pytest.ini 2025-12-04T11:58:24.4923759Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4923833Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4924104Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4924149Z Running 1 items in this shard 2025-12-04T11:58:24.4924151Z 2025-12-04T11:58:24.4924506Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda I1204 11:56:18.639000 356938 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 357007 2025-12-04T11:58:24.4924662Z I1204 11:56:18.640000 356938 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 357008 2025-12-04T11:58:24.4924816Z I1204 11:56:18.640000 356938 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 357009 2025-12-04T11:58:24.4924966Z I1204 11:56:18.641000 356938 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 357010 2025-12-04T11:58:24.4925463Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4925525Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4926019Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
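[editor's note] After the consistent failure, the harness retries only the failing node id ("Retrying single test..." with stepcurrent selecting one item). A hedged local equivalent using pytest's Python entry point is sketched below; the flags and the environment handling are illustrative, not the harness's exact invocation (the log's own repro line uses the unittest-style `python test/... Class.test` form instead).

```python
# Hedged sketch: rerun just the failing test node id locally.
import os

import pytest

# Environment variables copied from the repro line printed in the log above.
os.environ["PYTORCH_TEST_WITH_ROCM"] = "1"
os.environ["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"

exit_code = pytest.main([
    "-v", "-x",
    "test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::"
    "test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda",
])
print("Got exit code", int(exit_code))
```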
2025-12-04T11:58:24.4926081Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4926584Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4926643Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4927126Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4927186Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4927329Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4927493Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4927789Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4927965Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4928304Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4928429Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4928708Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4928857Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4929135Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4929284Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4929557Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4929697Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4929974Z [rank0]:E1204 11:56:25.737000 357007 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4930127Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4930646Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4930763Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4930991Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4931400Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4931516Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4931728Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4931895Z [rank0]:E1204 11:56:25.737000 357007 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4931934Z dist init r=0, world=4 2025-12-04T11:58:24.4932074Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4932233Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4932553Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4932708Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4932993Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4933117Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4933392Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4933542Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4933816Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4933963Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4934241Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4934377Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4934659Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4934808Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4935340Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.4935456Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4935653Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4936059Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4936174Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4936385Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4936549Z [rank2]:E1204 11:56:25.740000 357009 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4936612Z dist init r=2, world=4 2025-12-04T11:58:24.4936750Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4936912Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4937200Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4937355Z [rank3]:E1204 11:56:25.744000 357010 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4937639Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4937764Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4938041Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4938221Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4938499Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4938645Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4938925Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4939063Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4939338Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4939526Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4940038Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 
2025-12-04T11:58:24.4940155Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4940351Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4940758Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4940905Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4941116Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4941281Z [rank3]:E1204 11:56:25.744000 357010 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4941321Z dist init r=3, world=4 2025-12-04T11:58:24.4941459Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4941619Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4941906Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4942061Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4942344Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4942468Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4942743Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4942891Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4943167Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4943314Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4943592Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4943747Z [rank1]:E1204 11:56:25.785000 357008 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4944026Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4944175Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4944689Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4944804Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4945000Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4945425Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4945538Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4945750Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4945916Z [rank1]:E1204 11:56:25.785000 357008 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4945956Z dist init r=1, world=4 2025-12-04T11:58:24.4946292Z [rank0]:[W1204 11:56:25.574120505 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4946335Z FAILED [9.0137s] [100%] 2025-12-04T11:58:24.4946338Z 2025-12-04T11:58:24.4946394Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4946532Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4946579Z Traceback (most recent call last): 2025-12-04T11:58:24.4946744Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4946789Z self._join_processes(fn) 2025-12-04T11:58:24.4946963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4947018Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4947198Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4947243Z raise RuntimeError(error) 2025-12-04T11:58:24.4947323Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4947369Z Traceback (most recent call last): 2025-12-04T11:58:24.4947530Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4947573Z getattr(self, test_name)() 2025-12-04T11:58:24.4947749Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4947785Z fn() 2025-12-04T11:58:24.4947937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4947980Z method(*args, **kwargs) 2025-12-04T11:58:24.4948131Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4948212Z method(*args, **kwargs) 2025-12-04T11:58:24.4948362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4948399Z with policy(): 2025-12-04T11:58:24.4948551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4948594Z raise RuntimeError(msg) 2025-12-04T11:58:24.4948982Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4949466Z 2025-12-04T11:58:24.4949542Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4949824Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4949827Z 2025-12-04T11:58:24.4949913Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4949916Z 2025-12-04T11:58:24.4949918Z 2025-12-04T11:58:24.4949993Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4950082Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4950333Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-747b209bb437d791.xml - 2025-12-04T11:58:24.4950393Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4950693Z FAILED [9.0137s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4950739Z Traceback (most recent call last): 2025-12-04T11:58:24.4950904Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4950947Z getattr(self, test_name)() 2025-12-04T11:58:24.4951107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4951143Z fn() 2025-12-04T11:58:24.4951294Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4951335Z method(*args, **kwargs) 2025-12-04T11:58:24.4951485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4951526Z method(*args, **kwargs) 2025-12-04T11:58:24.4951676Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4951715Z with policy(): 2025-12-04T11:58:24.4951866Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4951908Z raise RuntimeError(msg) 2025-12-04T11:58:24.4952328Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4952330Z 2025-12-04T11:58:24.4952406Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4952688Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4952690Z 2025-12-04T11:58:24.4952776Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4952840Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4952902Z ======================= 1 failed, 7 deselected in 9.02s ======================== 2025-12-04T11:58:24.4952940Z Got exit code 1 2025-12-04T11:58:24.4952981Z Retrying single test... 2025-12-04T11:58:24.4953191Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-c8863613f03f4d7f.xml 2025-12-04T11:58:24.4953248Z ============================= test session starts ============================== 2025-12-04T11:58:24.4953382Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4953423Z cachedir: .pytest_cache 2025-12-04T11:58:24.4953582Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4953629Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4953670Z configfile: pytest.ini 2025-12-04T11:58:24.4953834Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4953908Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.4954181Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4954227Z Running 1 items in this shard 2025-12-04T11:58:24.4954230Z 2025-12-04T11:58:24.4954583Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda I1204 11:56:30.136000 357340 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 357409 2025-12-04T11:58:24.4954738Z I1204 11:56:30.136000 357340 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 357410 2025-12-04T11:58:24.4954890Z I1204 11:56:30.137000 357340 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 357411 2025-12-04T11:58:24.4955043Z I1204 11:56:30.137000 357340 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 357412 2025-12-04T11:58:24.4955542Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4955605Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4956093Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4956180Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4956663Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4956724Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4957205Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4957264Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4957408Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4957573Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4957884Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4958038Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4958360Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4958485Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4958762Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4958910Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4959187Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4959334Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4959611Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4959748Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4960029Z [rank0]:E1204 11:56:37.169000 357409 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4960178Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4960727Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4960844Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4961041Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4961449Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4961563Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4961776Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4961940Z [rank0]:E1204 11:56:37.169000 357409 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4962002Z dist init r=0, world=4 2025-12-04T11:58:24.4962142Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4962302Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4962588Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4962743Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4963026Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4963153Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4963428Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4963576Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4963853Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4963999Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4964275Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4964412Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4964690Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4964856Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4965371Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4965488Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4965684Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4966092Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4966224Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4966436Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4966599Z [rank1]:E1204 11:56:37.191000 357410 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4966639Z dist init r=1, world=4 2025-12-04T11:58:24.4966780Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4966943Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4967230Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4967385Z [rank2]:E1204 11:56:37.192000 357411 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4967670Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4967793Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4968071Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4968261Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4968539Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4968686Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4968963Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4969125Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4969402Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4969552Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4970063Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.4970180Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4970376Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4971040Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4971207Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4971532Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4971817Z [rank2]:E1204 11:56:37.192000 357411 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.4971874Z dist init r=2, world=4 2025-12-04T11:58:24.4972090Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4972336Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4972802Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4973036Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4973462Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4973661Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4974092Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4974317Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4974742Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4975042Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4975497Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4975711Z [rank3]:E1204 11:56:37.203000 357412 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4976076Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4976225Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4977005Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.4977150Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4977385Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4977812Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4977928Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4978237Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4978461Z [rank3]:E1204 11:56:37.203000 357412 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.4978503Z dist init r=3, world=4 2025-12-04T11:58:24.4978843Z [rank0]:[W1204 11:56:37.004034589 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.4978895Z FAILED [8.8128s] [100%] 2025-12-04T11:58:24.4978899Z 2025-12-04T11:58:24.4978975Z =================================== FAILURES =================================== 2025-12-04T11:58:24.4979113Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda _ 2025-12-04T11:58:24.4979162Z Traceback (most recent call last): 2025-12-04T11:58:24.4979326Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.4979373Z self._join_processes(fn) 2025-12-04T11:58:24.4979595Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.4979678Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.4979859Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.4979904Z raise RuntimeError(error) 2025-12-04T11:58:24.4979985Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4980072Z Traceback (most recent call last): 2025-12-04T11:58:24.4980234Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4980278Z getattr(self, test_name)() 2025-12-04T11:58:24.4980440Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4980477Z fn() 2025-12-04T11:58:24.4980629Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4980671Z method(*args, **kwargs) 2025-12-04T11:58:24.4980822Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4980864Z method(*args, **kwargs) 2025-12-04T11:58:24.4981015Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4981055Z with policy(): 2025-12-04T11:58:24.4981206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4981249Z raise RuntimeError(msg) 2025-12-04T11:58:24.4981673Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.4981675Z 2025-12-04T11:58:24.4981751Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4982031Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4982034Z 2025-12-04T11:58:24.4982125Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4982127Z 2025-12-04T11:58:24.4982129Z 2025-12-04T11:58:24.4982206Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.4982297Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.4982550Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-c8863613f03f4d7f.xml - 2025-12-04T11:58:24.4982612Z =========================== short test summary info ============================ 2025-12-04T11:58:24.4982905Z FAILED [8.8128s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.4982953Z Traceback (most recent call last): 2025-12-04T11:58:24.4983120Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4983164Z getattr(self, test_name)() 2025-12-04T11:58:24.4983322Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4983359Z fn() 2025-12-04T11:58:24.4983510Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4983551Z method(*args, **kwargs) 2025-12-04T11:58:24.4983700Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4983741Z method(*args, **kwargs) 2025-12-04T11:58:24.4983890Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4983948Z with policy(): 2025-12-04T11:58:24.4984099Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4984141Z raise RuntimeError(msg) 2025-12-04T11:58:24.4984528Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4984533Z 2025-12-04T11:58:24.4984608Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4984888Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4984891Z 2025-12-04T11:58:24.4984979Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4985043Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.4985105Z ======================= 1 failed, 7 deselected in 8.82s ======================== 2025-12-04T11:58:24.4985170Z Got exit code 1 2025-12-04T11:58:24.4985399Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda 2025-12-04T11:58:24.4985531Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.4985741Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-fcbfdc75c93b62ce.xml 2025-12-04T11:58:24.4985799Z ============================= test session starts ============================== 2025-12-04T11:58:24.4985915Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.4985958Z cachedir: .pytest_cache 2025-12-04T11:58:24.4986117Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.4986166Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.4986206Z configfile: pytest.ini 2025-12-04T11:58:24.4986371Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.4986445Z collecting ... collected 8 items / 5 deselected / 3 selected 2025-12-04T11:58:24.4986497Z stepcurrent: skipping 5 already run items. 2025-12-04T11:58:24.4986542Z Running 3 items in this shard 2025-12-04T11:58:24.4986544Z 2025-12-04T11:58:24.4986902Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda I1204 11:56:41.476000 357742 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 357811 2025-12-04T11:58:24.4987059Z I1204 11:56:41.477000 357742 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 357812 2025-12-04T11:58:24.4987210Z I1204 11:56:41.477000 357742 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 357813 2025-12-04T11:58:24.4987362Z I1204 11:56:41.478000 357742 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 357814 2025-12-04T11:58:24.4987869Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4987955Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4988522Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.4988585Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4989072Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4989132Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4989614Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.4989722Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.4989866Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4990031Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4990326Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4990482Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4990769Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4990894Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4991173Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4991321Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4991597Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4991745Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4992020Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4992156Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4992472Z [rank0]:E1204 11:56:48.492000 357811 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4992620Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4993137Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.4993253Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4993448Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4993857Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4993991Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4994203Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4994368Z [rank0]:E1204 11:56:48.492000 357811 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.4994407Z dist init r=0, world=4 2025-12-04T11:58:24.4994548Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4994708Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4994996Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4995151Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.4995434Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.4995560Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.4995836Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4995985Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.4996259Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.4996405Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.4996698Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.4996836Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.4997115Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.4997265Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.4997780Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.4997895Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4998091Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.4998566Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.4998681Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.4998893Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.4999058Z [rank1]:E1204 11:56:48.498000 357812 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.4999098Z dist init r=1, world=4 2025-12-04T11:58:24.4999237Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.4999398Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.4999684Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.4999838Z [rank2]:E1204 11:56:48.503000 357813 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5000122Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5000248Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5000524Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5000671Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5000986Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5001132Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5001407Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5001546Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5001824Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5001971Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5002485Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 
2025-12-04T11:58:24.5002652Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5002879Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5003288Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5003401Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5003612Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5003780Z [rank2]:E1204 11:56:48.503000 357813 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5003818Z dist init r=2, world=4 2025-12-04T11:58:24.5003958Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5004116Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5004404Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5004557Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5004842Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5004964Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5005240Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5005413Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5005687Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5005835Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5006113Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5006251Z [rank3]:E1204 11:56:48.514000 357814 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5006528Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5006676Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5007207Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.5007321Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5007517Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5007924Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5008040Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5008301Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5008467Z [rank3]:E1204 11:56:48.514000 357814 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5008509Z dist init r=3, world=4 2025-12-04T11:58:24.5008847Z [rank0]:[W1204 11:56:48.316232342 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.5008889Z FAILED [8.8143s] [ 33%] 2025-12-04T11:58:24.5008891Z 2025-12-04T11:58:24.5008948Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5009084Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.5009131Z Traceback (most recent call last): 2025-12-04T11:58:24.5009295Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5009339Z self._join_processes(fn) 2025-12-04T11:58:24.5009554Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5009610Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5009790Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5009835Z raise RuntimeError(error) 2025-12-04T11:58:24.5009917Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5009963Z Traceback (most recent call last): 2025-12-04T11:58:24.5010125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5010168Z getattr(self, test_name)() 2025-12-04T11:58:24.5010326Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5010361Z fn() 2025-12-04T11:58:24.5010514Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5010556Z method(*args, **kwargs) 2025-12-04T11:58:24.5010706Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5010780Z method(*args, **kwargs) 2025-12-04T11:58:24.5010930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5010968Z with policy(): 2025-12-04T11:58:24.5011119Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5011162Z raise RuntimeError(msg) 2025-12-04T11:58:24.5011550Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.5011553Z 2025-12-04T11:58:24.5011635Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5011915Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5011919Z 2025-12-04T11:58:24.5012008Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5012010Z 2025-12-04T11:58:24.5012012Z 2025-12-04T11:58:24.5012087Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5012176Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.5012431Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-fcbfdc75c93b62ce.xml - 2025-12-04T11:58:24.5012492Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5012786Z FAILED [8.8143s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5012834Z Traceback (most recent call last): 2025-12-04T11:58:24.5013000Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5013043Z getattr(self, test_name)() 2025-12-04T11:58:24.5013205Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5013260Z fn() 2025-12-04T11:58:24.5013454Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5013495Z method(*args, **kwargs) 2025-12-04T11:58:24.5013646Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5013686Z method(*args, **kwargs) 2025-12-04T11:58:24.5013839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5013876Z with policy(): 2025-12-04T11:58:24.5014029Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5014069Z raise RuntimeError(msg) 2025-12-04T11:58:24.5014460Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.5014462Z 2025-12-04T11:58:24.5014537Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5014817Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5014840Z 2025-12-04T11:58:24.5014929Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5014991Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5015055Z ======================= 1 failed, 5 deselected in 8.83s ======================== 2025-12-04T11:58:24.5015092Z Got exit code 1 2025-12-04T11:58:24.5015133Z Retrying single test... 2025-12-04T11:58:24.5015343Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-681abe36e7aff16a.xml 2025-12-04T11:58:24.5015401Z ============================= test session starts ============================== 2025-12-04T11:58:24.5015514Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5015558Z cachedir: .pytest_cache 2025-12-04T11:58:24.5015717Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5015766Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5015806Z configfile: pytest.ini 2025-12-04T11:58:24.5015970Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5016043Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5016317Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5016362Z Running 1 items in this shard 2025-12-04T11:58:24.5016364Z 2025-12-04T11:58:24.5016715Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda I1204 11:56:52.889000 358144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 358213 2025-12-04T11:58:24.5016870Z I1204 11:56:52.890000 358144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 358214 2025-12-04T11:58:24.5017022Z I1204 11:56:52.890000 358144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 358215 2025-12-04T11:58:24.5017174Z I1204 11:56:52.891000 358144 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 358216 2025-12-04T11:58:24.5017690Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5017754Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5018278Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5018338Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5018827Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5018929Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5019415Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5019473Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5019619Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5019783Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5020076Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5020235Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5020522Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5020648Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5020928Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5021077Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5021353Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5021499Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5021810Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5021947Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5022226Z [rank1]:E1204 11:57:00.029000 358214 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5022378Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5022895Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 2025-12-04T11:58:24.5023011Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5023206Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5023633Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5023749Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5023963Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5024128Z [rank1]:E1204 11:57:00.029000 358214 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5024167Z dist init r=1, world=4 2025-12-04T11:58:24.5024307Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5024466Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5024752Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5024905Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5025192Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5025316Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5025595Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5025741Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T11:58:24.5026037Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5026186Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5026462Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5026601Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5026877Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5027026Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5027541Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.5027675Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5027870Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5028331Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5028447Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5028660Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5028826Z [rank2]:E1204 11:57:00.035000 358215 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5028867Z dist init r=2, world=4 2025-12-04T11:58:24.5029004Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5029163Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5029451Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5029605Z [rank0]:E1204 11:57:00.069000 358213 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5029890Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5030015Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5030291Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5030469Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5030746Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5030894Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5031168Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5031303Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5031582Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5031730Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5032282Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.5032395Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5032591Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5033001Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5033117Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5033328Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5033492Z [rank0]:E1204 11:57:00.069000 358213 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5033531Z dist init r=0, world=4 2025-12-04T11:58:24.5033670Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5033830Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5034118Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5034271Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5034555Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5034698Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5034974Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5035123Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5035400Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5035548Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5035825Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5035961Z [rank3]:E1204 11:57:00.085000 358216 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5036264Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5036413Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5036926Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3454009344. 2025-12-04T11:58:24.5037039Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5037236Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5037644Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5037758Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5037971Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5038135Z [rank3]:E1204 11:57:00.085000 358216 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5038209Z dist init r=3, world=4 2025-12-04T11:58:24.5038546Z [rank0]:[W1204 11:57:00.977697371 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.5038587Z FAILED [9.2133s] [100%] 2025-12-04T11:58:24.5038589Z 2025-12-04T11:58:24.5038646Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5038782Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.5038859Z Traceback (most recent call last): 2025-12-04T11:58:24.5039023Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5039068Z self._join_processes(fn) 2025-12-04T11:58:24.5039245Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5039299Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5039478Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5039521Z raise RuntimeError(error) 2025-12-04T11:58:24.5039603Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5039649Z Traceback (most recent call last): 2025-12-04T11:58:24.5039814Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5039858Z getattr(self, test_name)() 2025-12-04T11:58:24.5040016Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5040084Z fn() 2025-12-04T11:58:24.5040235Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5040277Z method(*args, **kwargs) 2025-12-04T11:58:24.5040428Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5040469Z method(*args, **kwargs) 2025-12-04T11:58:24.5040618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5040656Z with policy(): 2025-12-04T11:58:24.5040810Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5040852Z raise RuntimeError(msg) 2025-12-04T11:58:24.5041239Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.5041243Z 2025-12-04T11:58:24.5041319Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5041599Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5041602Z 2025-12-04T11:58:24.5041690Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5041692Z 2025-12-04T11:58:24.5041754Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5041800Z Traceback (most recent call last): 2025-12-04T11:58:24.5041963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5042006Z getattr(self, test_name)() 2025-12-04T11:58:24.5042166Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5042200Z fn() 2025-12-04T11:58:24.5042352Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5042392Z method(*args, **kwargs) 2025-12-04T11:58:24.5042542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5042582Z method(*args, **kwargs) 2025-12-04T11:58:24.5042749Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5042786Z with policy(): 2025-12-04T11:58:24.5042937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5042980Z raise RuntimeError(msg) 2025-12-04T11:58:24.5043365Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.5043368Z 2025-12-04T11:58:24.5043442Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5043721Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5043724Z 2025-12-04T11:58:24.5043811Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5043813Z 2025-12-04T11:58:24.5043815Z 2025-12-04T11:58:24.5043890Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5044003Z Process 0 terminated with exit code 10, terminating remaining processes. 
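Note on the ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit"): this is separate from the leak assertion itself, but the shutdown pattern it asks for can be sketched as below. This is a minimal illustration assuming a plain torch.distributed setup, not the MultiProcessTestCase harness used by this test.

    import torch.distributed as dist

    def run(rank: int, world_size: int) -> None:
        # Assumed illustrative setup; the real test initializes its own store/backend.
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            ...  # collective work goes here
        finally:
            # Tear the default process group down explicitly so NCCL resources are
            # released before the process exits, avoiding the warning seen in the log.
            dist.destroy_process_group()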
2025-12-04T11:58:24.5044253Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-681abe36e7aff16a.xml - 2025-12-04T11:58:24.5044316Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5044610Z FAILED [9.2133s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5044661Z Traceback (most recent call last): 2025-12-04T11:58:24.5044824Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5044869Z getattr(self, test_name)() 2025-12-04T11:58:24.5045031Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5045066Z fn() 2025-12-04T11:58:24.5045218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5045258Z method(*args, **kwargs) 2025-12-04T11:58:24.5045410Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5045450Z method(*args, **kwargs) 2025-12-04T11:58:24.5045604Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5045641Z with policy(): 2025-12-04T11:58:24.5045794Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5045835Z raise RuntimeError(msg) 2025-12-04T11:58:24.5046224Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 
2025-12-04T11:58:24.5046226Z 2025-12-04T11:58:24.5046300Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5046579Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5046599Z 2025-12-04T11:58:24.5046690Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5046692Z 2025-12-04T11:58:24.5046752Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5046800Z Traceback (most recent call last): 2025-12-04T11:58:24.5046965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5047009Z getattr(self, test_name)() 2025-12-04T11:58:24.5047169Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5047207Z fn() 2025-12-04T11:58:24.5047358Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5047400Z method(*args, **kwargs) 2025-12-04T11:58:24.5047551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5047594Z method(*args, **kwargs) 2025-12-04T11:58:24.5047746Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5050838Z with policy(): 2025-12-04T11:58:24.5051003Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5051045Z raise RuntimeError(msg) 2025-12-04T11:58:24.5051438Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.5051440Z 2025-12-04T11:58:24.5051515Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5051800Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5051803Z 2025-12-04T11:58:24.5051892Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5051962Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5052026Z ======================= 1 failed, 7 deselected in 9.22s ======================== 2025-12-04T11:58:24.5052064Z Got exit code 1 2025-12-04T11:58:24.5052106Z Retrying single test... 
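Note on the leak assertion: the RuntimeError compares caching-allocator and driver allocation counters sampled before and after the test body (512 vs 3072/3584 bytes allocator-side, and the ~1.2 GB driver-side growth). A rough approximation of that kind of check is sketched below; it is not the actual mem-leak-check context manager in common_utils.py, and the function name and threshold logic are illustrative only.

    import torch

    def assert_no_cuda_leak(fn, device: int = 0) -> None:
        # Snapshot allocator and driver-level memory before running the test body.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)        # caching allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)      # driver-level free/total bytes
        fn()
        # Snapshot again after the test body and compare.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before or free_after < free_before:
            raise RuntimeError(
                f"possible CUDA leak: allocator {alloc_before} -> {alloc_after}, "
                f"driver allocated {total - free_before} -> {total - free_after}"
            )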
2025-12-04T11:58:24.5052316Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-993e4ba5ce1d3537.xml 2025-12-04T11:58:24.5052375Z ============================= test session starts ============================== 2025-12-04T11:58:24.5052491Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5052534Z cachedir: .pytest_cache 2025-12-04T11:58:24.5052693Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5052743Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5052783Z configfile: pytest.ini 2025-12-04T11:58:24.5052949Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5053022Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5053299Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5053343Z Running 1 items in this shard 2025-12-04T11:58:24.5053346Z 2025-12-04T11:58:24.5053759Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda I1204 11:57:04.434000 358546 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 358615 2025-12-04T11:58:24.5053919Z I1204 11:57:04.435000 358546 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 358616 2025-12-04T11:58:24.5054072Z I1204 11:57:04.435000 358546 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 358617 2025-12-04T11:58:24.5054223Z I1204 11:57:04.436000 358546 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 358618 2025-12-04T11:58:24.5054725Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5054789Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5055276Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5055384Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5055871Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5055930Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5056412Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5056469Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5056617Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5056784Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5057081Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5057239Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5057526Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5057655Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5057953Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5058104Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5058416Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5058566Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5058841Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5058982Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5059263Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5059447Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5059969Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.5060086Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5060283Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5060695Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5060812Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5061027Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5061193Z [rank3]:E1204 11:57:11.708000 358618 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5061233Z dist init r=3, world=4 2025-12-04T11:58:24.5061376Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5061536Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5061825Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5061978Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5062293Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5062418Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5062693Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5062841Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5063118Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5063265Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5063542Z [rank2]:E1204 11:57:11.715000 358617 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5063698Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5063975Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5064122Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5064636Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3504340992. 2025-12-04T11:58:24.5064752Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5064949Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5065355Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5065472Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5065684Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5065851Z [rank2]:E1204 11:57:11.715000 358617 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5065892Z dist init r=2, world=4 2025-12-04T11:58:24.5066031Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5066193Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5066479Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5066652Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5066936Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5067064Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T11:58:24.5067339Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5067487Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5067764Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5067910Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5068241Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5068377Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5068657Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5068804Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5069321Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3521118208. 
2025-12-04T11:58:24.5069439Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5069635Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5070042Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5070157Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5070370Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5070536Z [rank1]:E1204 11:57:11.749000 358616 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5070575Z dist init r=1, world=4 2025-12-04T11:58:24.5070712Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5070912Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5071201Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5071357Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5071642Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5071766Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5072045Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5072191Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5072500Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5072646Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5072923Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5073060Z [rank0]:E1204 11:57:11.758000 358615 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5073337Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5073487Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5073998Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3072 on device 0. CUDA driver allocated memory was 2459959296 and is now 3663724544. 2025-12-04T11:58:24.5074113Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5074308Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5074714Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5074829Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5075042Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5075227Z [rank0]:E1204 11:57:11.758000 358615 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5075266Z dist init r=0, world=4 2025-12-04T11:58:24.5075606Z [rank0]:[W1204 11:57:12.681914822 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T11:58:24.5075648Z FAILED [9.2146s] [100%] 2025-12-04T11:58:24.5075651Z 2025-12-04T11:58:24.5075708Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5075845Z _ TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda _ 2025-12-04T11:58:24.5075892Z Traceback (most recent call last): 2025-12-04T11:58:24.5076058Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5076102Z self._join_processes(fn) 2025-12-04T11:58:24.5076278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5076352Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5076532Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5076576Z raise RuntimeError(error) 2025-12-04T11:58:24.5076659Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.5076704Z Traceback (most recent call last): 2025-12-04T11:58:24.5076866Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5076909Z getattr(self, test_name)() 2025-12-04T11:58:24.5077068Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5077102Z fn() 2025-12-04T11:58:24.5077256Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5077298Z method(*args, **kwargs) 2025-12-04T11:58:24.5077450Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5077490Z method(*args, **kwargs) 2025-12-04T11:58:24.5077641Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5077678Z with policy(): 2025-12-04T11:58:24.5077831Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5077875Z raise RuntimeError(msg) 2025-12-04T11:58:24.5078308Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 
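The ProcessGroupNCCL warning above recommends calling destroy_process_group() before the program exits. A minimal teardown sketch of that pattern, assuming a launcher such as torchrun has already set RANK, WORLD_SIZE, MASTER_ADDR and MASTER_PORT (this is an illustration, not the test harness code itself):

    import os
    import torch
    import torch.distributed as dist

    def main() -> None:
        rank = int(os.environ.get("RANK", "0"))
        world_size = int(os.environ.get("WORLD_SIZE", "1"))
        # "nccl" maps onto RCCL on ROCm builds, so the same backend name applies here.
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        torch.cuda.set_device(rank)
        try:
            pass  # test or training body would run here
        finally:
            # Explicit shutdown; avoids the "destroy_process_group() was not called" warning.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()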
2025-12-04T11:58:24.5078312Z 2025-12-04T11:58:24.5078389Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5078670Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5078673Z 2025-12-04T11:58:24.5078761Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5078763Z 2025-12-04T11:58:24.5078765Z 2025-12-04T11:58:24.5078842Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5078963Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.5079217Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-993e4ba5ce1d3537.xml - 2025-12-04T11:58:24.5079277Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5079571Z FAILED [9.2146s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T11:58:24.5079617Z Traceback (most recent call last): 2025-12-04T11:58:24.5079782Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5079825Z getattr(self, test_name)() 2025-12-04T11:58:24.5079988Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5080022Z fn() 2025-12-04T11:58:24.5080174Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5080214Z method(*args, **kwargs) 2025-12-04T11:58:24.5080396Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5080435Z method(*args, **kwargs) 2025-12-04T11:58:24.5080584Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5080621Z with policy(): 2025-12-04T11:58:24.5080773Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5080813Z raise RuntimeError(msg) 2025-12-04T11:58:24.5081207Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3454009344. 2025-12-04T11:58:24.5081209Z 2025-12-04T11:58:24.5081284Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5081565Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5081567Z 2025-12-04T11:58:24.5081654Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5081717Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5081780Z ======================= 1 failed, 7 deselected in 9.22s ======================== 2025-12-04T11:58:24.5081817Z Got exit code 1 2025-12-04T11:58:24.5082050Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda 2025-12-04T11:58:24.5082179Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.5082389Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-21ec418542ac0484.xml 2025-12-04T11:58:24.5082446Z ============================= test session starts ============================== 2025-12-04T11:58:24.5082560Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5082601Z cachedir: .pytest_cache 2025-12-04T11:58:24.5082759Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5082806Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5082864Z configfile: pytest.ini 2025-12-04T11:58:24.5083029Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5083101Z collecting ... collected 8 items / 6 deselected / 2 selected 2025-12-04T11:58:24.5083157Z stepcurrent: skipping 6 already run items. 2025-12-04T11:58:24.5083202Z Running 2 items in this shard 2025-12-04T11:58:24.5083204Z 2025-12-04T11:58:24.5083507Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda I1204 11:57:16.270000 358948 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 359017 2025-12-04T11:58:24.5083661Z I1204 11:57:16.271000 358948 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 359018 2025-12-04T11:58:24.5083817Z I1204 11:57:16.271000 358948 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 359019 2025-12-04T11:58:24.5083970Z I1204 11:57:16.272000 358948 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 359020 2025-12-04T11:58:24.5084473Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5084557Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5085047Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5085107Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5085591Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5085652Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5086137Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5086196Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5086341Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5086506Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5086800Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5086954Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5087267Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5087392Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5087671Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5087821Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5088096Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5088285Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5088560Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5088733Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5089012Z [rank1]:E1204 11:57:23.719000 359018 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5089162Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5089630Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5089747Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5089943Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5090296Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5090411Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5090623Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5090787Z [rank1]:E1204 11:57:23.719000 359018 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5090829Z dist init r=1, world=4 2025-12-04T11:58:24.5090967Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5091127Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5091417Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5091600Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5091883Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5092010Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5092283Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5092431Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5092710Z [rank0]:E1204 11:57:23.725000 359017 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5092855Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5093152Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5093288Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5093568Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5093718Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5094180Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 0. CUDA driver allocated memory was 2462056448 and is now 3665821696. 2025-12-04T11:58:24.5094297Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5094492Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5094846Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5094958Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5095171Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5095336Z [rank0]:E1204 11:57:23.725000 359017 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5095376Z dist init r=0, world=4 2025-12-04T11:58:24.5095513Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5095694Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5095983Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5096137Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5096423Z [rank2]:E1204 11:57:23.757000 359019 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5096546Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5096823Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5096970Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5097245Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5097412Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5097686Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5097824Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5098100Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5098289Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5098750Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 
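The RuntimeError above is raised by the test suite's CUDA memory leak check, which compares caching-allocator and driver-level allocations before and after each test. A standalone sketch of the same before/after comparison using only public torch.cuda APIs (an illustration of the idea, not the actual check in common_utils.py; the helper name run_with_leak_check is made up for this example):

    import torch

    def run_with_leak_check(fn, device: int = 0) -> None:
        # Hypothetical helper, named for this example only.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)    # caching-allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)  # driver-level free/total bytes
        fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before and free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator went from "
                f"{alloc_before} to {alloc_after} bytes, driver-allocated memory from "
                f"{total - free_before} to {total - free_after} bytes"
            )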
2025-12-04T11:58:24.5098866Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5099066Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5099417Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5099532Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5099743Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5099907Z [rank2]:E1204 11:57:23.757000 359019 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5099975Z dist init r=2, world=4 2025-12-04T11:58:24.5100115Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5100275Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5100564Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5100719Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5101003Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5101129Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5101407Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5101595Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5101869Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5102015Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5102291Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5102425Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.5102703Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5102849Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5103315Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:58:24.5103430Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5103628Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5103982Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5104094Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5104323Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5104487Z [rank3]:E1204 11:57:23.763000 359020 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5104528Z dist init r=3, world=4 2025-12-04T11:58:24.5104566Z FAILED [8.8123s] [ 50%] 2025-12-04T11:58:24.5104569Z 2025-12-04T11:58:24.5104625Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5104722Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda _________ 2025-12-04T11:58:24.5104769Z Traceback (most recent call last): 2025-12-04T11:58:24.5104933Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5104976Z self._join_processes(fn) 2025-12-04T11:58:24.5105152Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5105207Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5105385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5105451Z raise RuntimeError(error) 2025-12-04T11:58:24.5105532Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5105579Z Traceback (most recent call last): 2025-12-04T11:58:24.5105739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5105782Z getattr(self, test_name)() 2025-12-04T11:58:24.5105942Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:58:24.5105976Z fn() 2025-12-04T11:58:24.5106129Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5106171Z method(*args, **kwargs) 2025-12-04T11:58:24.5106320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5106362Z method(*args, **kwargs) 2025-12-04T11:58:24.5106511Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5106548Z with policy(): 2025-12-04T11:58:24.5106699Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5106741Z raise RuntimeError(msg) 2025-12-04T11:58:24.5107078Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5107081Z 2025-12-04T11:58:24.5107157Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5107383Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5107387Z 2025-12-04T11:58:24.5107475Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5107477Z 2025-12-04T11:58:24.5107478Z 2025-12-04T11:58:24.5107555Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5107642Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.5107891Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-21ec418542ac0484.xml - 2025-12-04T11:58:24.5107969Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5108248Z FAILED [8.8123s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5108296Z Traceback (most recent call last): 2025-12-04T11:58:24.5108462Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5108504Z getattr(self, test_name)() 2025-12-04T11:58:24.5108663Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5108698Z fn() 2025-12-04T11:58:24.5108848Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5108889Z method(*args, **kwargs) 2025-12-04T11:58:24.5109039Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5109079Z method(*args, **kwargs) 2025-12-04T11:58:24.5109227Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5109299Z with policy(): 2025-12-04T11:58:24.5109449Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5109490Z raise RuntimeError(msg) 2025-12-04T11:58:24.5109827Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5109829Z 2025-12-04T11:58:24.5109903Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5110127Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5110130Z 2025-12-04T11:58:24.5110217Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5110282Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5110345Z ======================= 1 failed, 6 deselected in 8.82s ======================== 2025-12-04T11:58:24.5110381Z Got exit code 1 2025-12-04T11:58:24.5110422Z Retrying single test... 
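The repeated UserWarning from torch/distributed/fsdp/_init_utils.py above suggests two fixes: call torch.cuda.set_device() before constructing FSDP, or pass a device_id with an explicit index. A minimal single-process sketch of both, assuming one visible GPU and a free local TCP port (model and local_rank are placeholder names, not taken from the failing test):

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Single-rank group just so FSDP can be constructed in this sketch.
    dist.init_process_group("nccl", init_method="tcp://127.0.0.1:29500",
                            rank=0, world_size=1)

    local_rank = 0  # under torchrun this would be int(os.environ["LOCAL_RANK"])

    # Option 1: make the current device explicit before building FSDP.
    torch.cuda.set_device(local_rank)

    # Option 2: pass an indexed device instead of the bare "cuda" string.
    model = torch.nn.Linear(8, 8)
    fsdp_model = FSDP(model, device_id=torch.device("cuda", local_rank))

    dist.destroy_process_group()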
2025-12-04T11:58:24.5110629Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-a2b976bde10bee43.xml 2025-12-04T11:58:24.5110686Z ============================= test session starts ============================== 2025-12-04T11:58:24.5110801Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5110842Z cachedir: .pytest_cache 2025-12-04T11:58:24.5111005Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5111054Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5111098Z configfile: pytest.ini 2025-12-04T11:58:24.5111262Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5111337Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5111553Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5111599Z Running 1 items in this shard 2025-12-04T11:58:24.5111601Z 2025-12-04T11:58:24.5111936Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda I1204 11:57:27.659000 359342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 359411 2025-12-04T11:58:24.5112094Z I1204 11:57:27.659000 359342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 359412 2025-12-04T11:58:24.5112247Z I1204 11:57:27.660000 359342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 359413 2025-12-04T11:58:24.5112398Z I1204 11:57:27.660000 359342 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 359414 2025-12-04T11:58:24.5112899Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5112963Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5113454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5113533Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5114019Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5114079Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5114564Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5114623Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5114767Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5114931Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5115221Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5115377Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5115668Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5115794Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5116072Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5116240Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5116518Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5116667Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5116942Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5117078Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5117357Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5117506Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5118001Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5118119Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5118350Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5118704Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5118821Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5119032Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5119197Z [rank2]:E1204 11:57:35.142000 359413 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5119236Z dist init r=2, world=4 2025-12-04T11:58:24.5119376Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5119534Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5119820Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5119975Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5120263Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5120388Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5120697Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5120845Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5121123Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5121270Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5121547Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5121684Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5121961Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5122142Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5122609Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5122725Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5122921Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5123274Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5123389Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5123602Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5123768Z [rank1]:E1204 11:57:35.164000 359412 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5123808Z dist init r=1, world=4 2025-12-04T11:58:24.5123945Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5124106Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5124391Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5124546Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5124852Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5124978Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5125255Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5125402Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5125677Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5125824Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5126099Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5126255Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5126533Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5126681Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5127148Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 
2025-12-04T11:58:24.5127264Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5127459Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5127810Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5127924Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5128136Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5128353Z [rank0]:E1204 11:57:35.186000 359411 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5128393Z dist init r=0, world=4 2025-12-04T11:58:24.5128531Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5128690Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5129011Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5129165Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5129452Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5129578Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5129854Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5130002Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5130278Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5130426Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5130741Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5130876Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.5131154Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5131302Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5131768Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:58:24.5131884Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5132080Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5132431Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5132545Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5132757Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5132921Z [rank3]:E1204 11:57:35.187000 359414 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5132959Z dist init r=3, world=4 2025-12-04T11:58:24.5132998Z FAILED [8.9117s] [100%] 2025-12-04T11:58:24.5133000Z 2025-12-04T11:58:24.5133059Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5133180Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda _________ 2025-12-04T11:58:24.5133229Z Traceback (most recent call last): 2025-12-04T11:58:24.5133393Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5133438Z self._join_processes(fn) 2025-12-04T11:58:24.5133610Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5133666Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5133845Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5133889Z raise RuntimeError(error) 2025-12-04T11:58:24.5133970Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5134017Z Traceback (most recent call last): 2025-12-04T11:58:24.5134179Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5134222Z getattr(self, test_name)() 2025-12-04T11:58:24.5134379Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:58:24.5134437Z fn() 2025-12-04T11:58:24.5134587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5134630Z method(*args, **kwargs) 2025-12-04T11:58:24.5134779Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5134820Z method(*args, **kwargs) 2025-12-04T11:58:24.5134970Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5135009Z with policy(): 2025-12-04T11:58:24.5135163Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5135205Z raise RuntimeError(msg) 2025-12-04T11:58:24.5135541Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5135545Z 2025-12-04T11:58:24.5135621Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5135846Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5135848Z 2025-12-04T11:58:24.5135937Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5135939Z 2025-12-04T11:58:24.5136001Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5136047Z Traceback (most recent call last): 2025-12-04T11:58:24.5136211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5136253Z getattr(self, test_name)() 2025-12-04T11:58:24.5136414Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5136448Z fn() 2025-12-04T11:58:24.5136599Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5136639Z method(*args, **kwargs) 2025-12-04T11:58:24.5136789Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5136829Z method(*args, **kwargs) 2025-12-04T11:58:24.5136997Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5137034Z with policy(): 2025-12-04T11:58:24.5137186Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5137229Z raise RuntimeError(msg) 2025-12-04T11:58:24.5137564Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 
2025-12-04T11:58:24.5137566Z 2025-12-04T11:58:24.5137641Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5137864Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5137866Z 2025-12-04T11:58:24.5137956Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5137958Z 2025-12-04T11:58:24.5137960Z 2025-12-04T11:58:24.5138036Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5138125Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.5138441Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-a2b976bde10bee43.xml - 2025-12-04T11:58:24.5138503Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5138746Z FAILED [8.9117s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5138793Z Traceback (most recent call last): 2025-12-04T11:58:24.5138958Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5139000Z getattr(self, test_name)() 2025-12-04T11:58:24.5139159Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5139196Z fn() 2025-12-04T11:58:24.5139347Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5139386Z method(*args, **kwargs) 2025-12-04T11:58:24.5139536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5139576Z method(*args, **kwargs) 2025-12-04T11:58:24.5139725Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5139762Z with policy(): 2025-12-04T11:58:24.5139916Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5139956Z raise RuntimeError(msg) 2025-12-04T11:58:24.5140294Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 
2025-12-04T11:58:24.5140298Z 2025-12-04T11:58:24.5140372Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5140596Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5140598Z 2025-12-04T11:58:24.5140685Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5140687Z 2025-12-04T11:58:24.5140746Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5140825Z Traceback (most recent call last): 2025-12-04T11:58:24.5140986Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5141028Z getattr(self, test_name)() 2025-12-04T11:58:24.5141188Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5141223Z fn() 2025-12-04T11:58:24.5141373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5141413Z method(*args, **kwargs) 2025-12-04T11:58:24.5141561Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5141601Z method(*args, **kwargs) 2025-12-04T11:58:24.5141751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5141789Z with policy(): 2025-12-04T11:58:24.5141941Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5141982Z raise RuntimeError(msg) 2025-12-04T11:58:24.5142347Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5142351Z 2025-12-04T11:58:24.5142423Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5142645Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5142647Z 2025-12-04T11:58:24.5142734Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5142798Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5142860Z ======================= 1 failed, 7 deselected in 8.92s ======================== 2025-12-04T11:58:24.5142900Z Got exit code 1 2025-12-04T11:58:24.5142940Z Retrying single test... 
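The failure above comes from the memory-leak check whose __exit__ (torch/testing/_internal/common_utils.py, line 2705 in every traceback) raises once per-device memory after the test is higher than before it: the RuntimeError reports both a caching-allocator figure and a driver-level figure. The following is only a rough sketch of that before/after comparison, under the assumption that torch.cuda.memory_allocated() and torch.cuda.mem_get_info() approximate those two numbers; it is not the actual PyTorch implementation, and the snapshot/check_for_leak names are invented for illustration.

    import torch

    def snapshot(device: int) -> tuple[int, int]:
        # (caching-allocator bytes, driver-level allocated bytes) for one GPU
        torch.cuda.synchronize(device)
        allocator_bytes = torch.cuda.memory_allocated(device)
        free_bytes, total_bytes = torch.cuda.mem_get_info(device)
        return allocator_bytes, total_bytes - free_bytes

    def check_for_leak(device: int, run_test) -> None:
        before_alloc, before_driver = snapshot(device)
        run_test()
        torch.cuda.empty_cache()  # drop cached blocks so any growth reflects live allocations
        after_alloc, after_driver = snapshot(device)
        if after_alloc > before_alloc and after_driver > before_driver:
            raise RuntimeError(
                f"possible leak on device {device}: allocator {before_alloc} -> {after_alloc}, "
                f"driver {before_driver} -> {after_driver}"
            )

The repro line printed by the harness (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda) presumably enables the same check when reproducing outside CI.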
2025-12-04T11:58:24.5143146Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-40549f4e9028d159.xml 2025-12-04T11:58:24.5143204Z ============================= test session starts ============================== 2025-12-04T11:58:24.5143317Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5143357Z cachedir: .pytest_cache 2025-12-04T11:58:24.5143516Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5143563Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5143604Z configfile: pytest.ini 2025-12-04T11:58:24.5143767Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5143840Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5144060Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5144105Z Running 1 items in this shard 2025-12-04T11:58:24.5144107Z 2025-12-04T11:58:24.5144409Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda I1204 11:57:39.158000 359736 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 359805 2025-12-04T11:58:24.5144563Z I1204 11:57:39.159000 359736 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 359806 2025-12-04T11:58:24.5144735Z I1204 11:57:39.159000 359736 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 359807 2025-12-04T11:58:24.5144886Z I1204 11:57:39.161000 359736 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 359808 2025-12-04T11:58:24.5145385Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5145447Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5145937Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5145997Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5146500Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5146560Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5147047Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5147106Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5147252Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5147416Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5147707Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5147864Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5148192Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5148319Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5148597Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5148745Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5149062Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5149210Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5149485Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5149624Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5149903Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5150054Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5150519Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5150667Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5150865Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5151218Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5151333Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5151544Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5151711Z [rank2]:E1204 11:57:46.408000 359807 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5151750Z dist init r=2, world=4 2025-12-04T11:58:24.5151889Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5152050Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5152339Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5152494Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5152780Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5152905Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5153180Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5153346Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5153622Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5153771Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5154046Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5154181Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5154462Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5154610Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5155094Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5155210Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5155406Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5155758Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5155873Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5156085Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5156249Z [rank1]:E1204 11:57:46.414000 359806 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5156289Z dist init r=1, world=4 2025-12-04T11:58:24.5156427Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5156588Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5156875Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5157030Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5157315Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5157438Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5157732Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5157881Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5158191Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5158338Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5158613Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5158750Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5159029Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5159212Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5159674Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 
2025-12-04T11:58:24.5159790Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5159985Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5160337Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5160451Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5160661Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5160827Z [rank0]:E1204 11:57:46.417000 359805 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5160865Z dist init r=0, world=4 2025-12-04T11:58:24.5161003Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5161164Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5161452Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5161606Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5161920Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5162044Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5162321Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5162469Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5162743Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5162892Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5163167Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5163328Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.5163608Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5163756Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5164218Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 3. CUDA driver allocated memory was 2243952640 and is now 3456106496. 2025-12-04T11:58:24.5164334Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5164531Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5164883Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5164998Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5165209Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5165374Z [rank3]:E1204 11:57:46.463000 359808 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5165413Z dist init r=3, world=4 2025-12-04T11:58:24.5165451Z FAILED [8.5139s] [100%] 2025-12-04T11:58:24.5165453Z 2025-12-04T11:58:24.5165511Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5165607Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda _________ 2025-12-04T11:58:24.5165654Z Traceback (most recent call last): 2025-12-04T11:58:24.5165816Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5165879Z self._join_processes(fn) 2025-12-04T11:58:24.5166052Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5166107Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5166287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5166331Z raise RuntimeError(error) 2025-12-04T11:58:24.5166413Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5166458Z Traceback (most recent call last): 2025-12-04T11:58:24.5166619Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5166661Z getattr(self, test_name)() 2025-12-04T11:58:24.5166821Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:58:24.5166856Z fn() 2025-12-04T11:58:24.5167008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5167048Z method(*args, **kwargs) 2025-12-04T11:58:24.5167219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5167259Z method(*args, **kwargs) 2025-12-04T11:58:24.5167409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5167446Z with policy(): 2025-12-04T11:58:24.5167599Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5167639Z raise RuntimeError(msg) 2025-12-04T11:58:24.5167979Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5167981Z 2025-12-04T11:58:24.5168056Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5168335Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5168337Z 2025-12-04T11:58:24.5168426Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5168429Z 2025-12-04T11:58:24.5168430Z 2025-12-04T11:58:24.5168505Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5168594Z Process 2 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.5168844Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-40549f4e9028d159.xml - 2025-12-04T11:58:24.5168905Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5169146Z FAILED [8.5139s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5169195Z Traceback (most recent call last): 2025-12-04T11:58:24.5169357Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5169401Z getattr(self, test_name)() 2025-12-04T11:58:24.5169558Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5169594Z fn() 2025-12-04T11:58:24.5169783Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5169824Z method(*args, **kwargs) 2025-12-04T11:58:24.5169973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5170015Z method(*args, **kwargs) 2025-12-04T11:58:24.5170164Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5170201Z with policy(): 2025-12-04T11:58:24.5170356Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5170396Z raise RuntimeError(msg) 2025-12-04T11:58:24.5170739Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 3584 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5170742Z 2025-12-04T11:58:24.5170814Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5171037Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5171071Z 2025-12-04T11:58:24.5171158Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5171221Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T11:58:24.5171283Z ======================= 1 failed, 7 deselected in 8.52s ======================== 2025-12-04T11:58:24.5171321Z Got exit code 1 2025-12-04T11:58:24.5171493Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda 2025-12-04T11:58:24.5171623Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.5171830Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-2e453835a51e706a.xml 2025-12-04T11:58:24.5171888Z ============================= test session starts ============================== 2025-12-04T11:58:24.5172002Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5172043Z cachedir: .pytest_cache 2025-12-04T11:58:24.5172200Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5172246Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5172287Z configfile: pytest.ini 2025-12-04T11:58:24.5172449Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5172521Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5172576Z stepcurrent: skipping 7 already run items. 2025-12-04T11:58:24.5172621Z Running 1 items in this shard 2025-12-04T11:58:24.5172623Z 2025-12-04T11:58:24.5172926Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda I1204 11:57:50.492000 360130 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 360199 2025-12-04T11:58:24.5173082Z I1204 11:57:50.493000 360130 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 360200 2025-12-04T11:58:24.5173234Z I1204 11:57:50.494000 360130 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 360201 2025-12-04T11:58:24.5173386Z I1204 11:57:50.494000 360130 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 360202 2025-12-04T11:58:24.5173899Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5173963Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5174454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5174514Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5175005Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5175085Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5175569Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5175627Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5175771Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5175936Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5176224Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5176381Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5176666Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5176792Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5177073Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5177222Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5177501Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5177649Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5177944Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5178081Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5178409Z [rank2]:E1204 11:57:57.692000 360201 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5178560Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5179022Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5179140Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5179335Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5179727Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5179842Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5180053Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5180219Z [rank2]:E1204 11:57:57.692000 360201 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5180259Z dist init r=2, world=4 2025-12-04T11:58:24.5180397Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5180557Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5180843Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5180997Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5181283Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5181407Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5181686Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5181835Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5182109Z [rank0]:E1204 11:57:57.699000 360199 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5182288Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5182563Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5182702Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5182979Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5183128Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5183591Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:58:24.5183730Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5183927Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5184279Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5184395Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5184606Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5184769Z [rank0]:E1204 11:57:57.699000 360199 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5184811Z dist init r=0, world=4 2025-12-04T11:58:24.5184948Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5185107Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5185394Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5185549Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5185833Z [rank1]:E1204 11:57:57.734000 360200 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5185959Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5186237Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5186384Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5186678Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5186825Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5187101Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5187236Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5187515Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5187663Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5188127Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 
2025-12-04T11:58:24.5188301Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5188498Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5188852Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5188965Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5189177Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5189341Z [rank1]:E1204 11:57:57.734000 360200 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5189379Z dist init r=1, world=4 2025-12-04T11:58:24.5189516Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5189676Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5189962Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5190117Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5190401Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5190524Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5190834Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5190983Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5191259Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5191407Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5191681Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5191819Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T11:58:24.5192094Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5192275Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5192735Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 3. CUDA driver allocated memory was 2243952640 and is now 3456106496. 2025-12-04T11:58:24.5192849Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5193047Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5193397Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5193514Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5193724Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5193888Z [rank3]:E1204 11:57:57.743000 360202 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5193927Z dist init r=3, world=4 2025-12-04T11:58:24.5193967Z FAILED [8.5130s] [100%] 2025-12-04T11:58:24.5193969Z 2025-12-04T11:58:24.5194025Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5194122Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda _________ 2025-12-04T11:58:24.5194170Z Traceback (most recent call last): 2025-12-04T11:58:24.5194332Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5194376Z self._join_processes(fn) 2025-12-04T11:58:24.5194549Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5194603Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5194781Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5194854Z raise RuntimeError(error) 2025-12-04T11:58:24.5194935Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5194981Z Traceback (most recent call last): 2025-12-04T11:58:24.5195142Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5195187Z getattr(self, test_name)() 2025-12-04T11:58:24.5195345Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:58:24.5195380Z fn() 2025-12-04T11:58:24.5195531Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5195572Z method(*args, **kwargs) 2025-12-04T11:58:24.5195721Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5195763Z method(*args, **kwargs) 2025-12-04T11:58:24.5195913Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5195950Z with policy(): 2025-12-04T11:58:24.5196101Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5196164Z raise RuntimeError(msg) 2025-12-04T11:58:24.5196499Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:58:24.5196502Z 2025-12-04T11:58:24.5196577Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5196801Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5196803Z 2025-12-04T11:58:24.5196891Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5196893Z 2025-12-04T11:58:24.5196895Z 2025-12-04T11:58:24.5196971Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5197058Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.5197305Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-2e453835a51e706a.xml - 2025-12-04T11:58:24.5197366Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5197611Z FAILED [8.5130s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T11:58:24.5197660Z Traceback (most recent call last): 2025-12-04T11:58:24.5197823Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5197866Z getattr(self, test_name)() 2025-12-04T11:58:24.5198026Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5198061Z fn() 2025-12-04T11:58:24.5198254Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5198296Z method(*args, **kwargs) 2025-12-04T11:58:24.5198445Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5198486Z method(*args, **kwargs) 2025-12-04T11:58:24.5198670Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5198708Z with policy(): 2025-12-04T11:58:24.5198859Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5198901Z raise RuntimeError(msg) 2025-12-04T11:58:24.5199238Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:58:24.5199241Z 2025-12-04T11:58:24.5199316Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5199538Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5199541Z 2025-12-04T11:58:24.5199629Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5199693Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5199755Z ======================= 1 failed, 7 deselected in 8.52s ======================== 2025-12-04T11:58:24.5199794Z Got exit code 1 2025-12-04T11:58:24.5199869Z Retrying single test... 
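Note (illustration, not part of the log): the failure above comes from PyTorch's CUDA memory-leak check (enabled here via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 in the printed repro command), which compares caching-allocator and driver-level memory before and after the test. The sketch below is a minimal stand-in for that idea, not the actual CudaMemoryLeakCheck implementation in common_utils.py; the workload and thresholds are placeholders.

```python
# Minimal before/after CUDA memory comparison, in the spirit of the leak check
# reported in the log. NOT the common_utils.py implementation; it only shows
# the two quantities quoted in the failure: caching-allocator bytes
# (torch.cuda.memory_allocated) and driver-level usage (torch.cuda.mem_get_info).
import torch


def driver_allocated(device: int) -> int:
    # mem_get_info returns (free, total) bytes for the device.
    free, total = torch.cuda.mem_get_info(device)
    return total - free


def run_with_leak_check(fn, device: int = 0) -> None:
    if not torch.cuda.is_available():
        print("No CUDA/ROCm device available; skipping check.")
        return
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    driver_before = driver_allocated(device)

    fn()

    # Drop cached blocks so any growth reflects live allocations, then re-measure.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    driver_after = driver_allocated(device)

    if alloc_after > alloc_before:
        raise RuntimeError(
            f"Possible leak on device {device}: caching allocator went from "
            f"{alloc_before} to {alloc_after} bytes "
            f"(driver: {driver_before} -> {driver_after})."
        )


# Example usage: a workload that "leaks" by keeping a global reference alive.
_leaked = []


def leaky_workload() -> None:
    _leaked.append(torch.ones(1024, device="cuda"))


if __name__ == "__main__":
    run_with_leak_check(leaky_workload)
```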
2025-12-04T11:58:24.5200077Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-e1a51fb950d40eb7.xml 2025-12-04T11:58:24.5200135Z ============================= test session starts ============================== 2025-12-04T11:58:24.5200248Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5200289Z cachedir: .pytest_cache 2025-12-04T11:58:24.5200447Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5200495Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5200536Z configfile: pytest.ini 2025-12-04T11:58:24.5200697Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5200769Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5200988Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5201034Z Running 1 items in this shard 2025-12-04T11:58:24.5201036Z 2025-12-04T11:58:24.5201336Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda I1204 11:58:01.833000 360524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 360593 2025-12-04T11:58:24.5201490Z I1204 11:58:01.833000 360524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 360594 2025-12-04T11:58:24.5201645Z I1204 11:58:01.834000 360524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 360595 2025-12-04T11:58:24.5201794Z I1204 11:58:01.835000 360524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 360596 2025-12-04T11:58:24.5202294Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5202355Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5202862Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5202922Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5203407Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5203466Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5203950Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5204007Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5204169Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5204332Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5204622Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5204778Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5205064Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5205190Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5205467Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5205614Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5205892Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5206038Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5206314Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5206453Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5206734Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5206901Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5207365Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5207483Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5207680Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5208033Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5208179Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5208390Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5208588Z [rank1]:E1204 11:58:09.005000 360594 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5208627Z dist init r=1, world=4 2025-12-04T11:58:24.5208765Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5208924Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5209213Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5209367Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5209653Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5209778Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5210052Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5210201Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5210475Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5210624Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5210897Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5211034Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5211353Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5211501Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5211964Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 2025-12-04T11:58:24.5212079Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5212276Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5212626Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5212761Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5212971Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5213135Z [rank0]:E1204 11:58:09.011000 360593 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5213272Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5213434Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5213722Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5213877Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5214160Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5214285Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5214562Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5214709Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5214986Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5215132Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5215425Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5215563Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5215843Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5215993Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5216454Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 3. CUDA driver allocated memory was 2243952640 and is now 3456106496. 
2025-12-04T11:58:24.5216569Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5216764Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5217952Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5218066Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5218319Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5218485Z [rank3]:E1204 11:58:09.012000 360596 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5218524Z dist init r=0, world=4 2025-12-04T11:58:24.5218562Z dist init r=3, world=4 2025-12-04T11:58:24.5218699Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5218860Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5219146Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5219300Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5219585Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5219709Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5219986Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5220134Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5220410Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5220592Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5220866Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5221004Z [rank2]:E1204 11:58:09.076000 360595 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5221284Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5221432Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5221895Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5222039Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5222235Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5222584Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5222700Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5222911Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5223076Z [rank2]:E1204 11:58:09.076000 360595 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5223114Z dist init r=2, world=4 2025-12-04T11:58:24.5223152Z FAILED [8.3134s] [100%] 2025-12-04T11:58:24.5223154Z 2025-12-04T11:58:24.5223211Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5223308Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda _________ 2025-12-04T11:58:24.5223355Z Traceback (most recent call last): 2025-12-04T11:58:24.5223517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5223563Z self._join_processes(fn) 2025-12-04T11:58:24.5223735Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5223790Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5223968Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5224013Z raise RuntimeError(error) 2025-12-04T11:58:24.5224093Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5224139Z Traceback (most recent call last): 2025-12-04T11:58:24.5224300Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5224342Z getattr(self, test_name)() 2025-12-04T11:58:24.5224522Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5224558Z fn() 2025-12-04T11:58:24.5224709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5224751Z method(*args, **kwargs) 2025-12-04T11:58:24.5224903Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5224942Z method(*args, **kwargs) 2025-12-04T11:58:24.5225093Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5225129Z with policy(): 2025-12-04T11:58:24.5225282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5225322Z raise RuntimeError(msg) 2025-12-04T11:58:24.5225663Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5225665Z 2025-12-04T11:58:24.5225740Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5225991Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5225993Z 2025-12-04T11:58:24.5226081Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5226083Z 2025-12-04T11:58:24.5226084Z 2025-12-04T11:58:24.5226160Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5226249Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T11:58:24.5226499Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-e1a51fb950d40eb7.xml - 2025-12-04T11:58:24.5226560Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5226799Z FAILED [8.3134s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5226848Z Traceback (most recent call last): 2025-12-04T11:58:24.5227010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5227053Z getattr(self, test_name)() 2025-12-04T11:58:24.5227211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5227246Z fn() 2025-12-04T11:58:24.5227398Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5227439Z method(*args, **kwargs) 2025-12-04T11:58:24.5227587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5227629Z method(*args, **kwargs) 2025-12-04T11:58:24.5227779Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5227817Z with policy(): 2025-12-04T11:58:24.5227968Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5228010Z raise RuntimeError(msg) 2025-12-04T11:58:24.5228427Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5228431Z 2025-12-04T11:58:24.5228504Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5228727Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5228730Z 2025-12-04T11:58:24.5228817Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5228880Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T11:58:24.5228941Z ======================= 1 failed, 7 deselected in 8.32s ======================== 2025-12-04T11:58:24.5228979Z Got exit code 1 2025-12-04T11:58:24.5229019Z Retrying single test... 
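Note (illustration, not part of the log): both retry sessions also emit the FSDP UserWarning that `device_id` was passed as a bare "cuda" without an index, and the warning itself names the two fixes: call `torch.cuda.set_device()` before FSDP initialization, or pass an explicit device index as `device_id`. The sketch below shows both options under the assumption of a torchrun-style per-rank launch; the tiny model and process-group details are placeholders, not taken from the failing test.

```python
# Per-rank setup addressing the FSDP `device_id` UserWarning from the log.
# Assumes launch via torchrun (LOCAL_RANK set); model and backend are
# illustrative placeholders, not the test_fsdp_exec_order.py configuration.
import os

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main() -> None:
    rank = int(os.environ["LOCAL_RANK"])
    dist.init_process_group(backend="nccl")

    # Option 1: make the current device explicit before FSDP initialization.
    torch.cuda.set_device(rank)

    model = torch.nn.Linear(8, 8).cuda()

    # Option 2: pass an explicit device index instead of the bare "cuda".
    fsdp_model = FSDP(model, device_id=rank)

    out = fsdp_model(torch.randn(4, 8, device=f"cuda:{rank}"))
    print(f"rank {rank}: output shape {tuple(out.shape)}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```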
2025-12-04T11:58:24.5229225Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-94eafe77b3573973.xml 2025-12-04T11:58:24.5229283Z ============================= test session starts ============================== 2025-12-04T11:58:24.5229395Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5229436Z cachedir: .pytest_cache 2025-12-04T11:58:24.5229594Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5229679Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5229721Z configfile: pytest.ini 2025-12-04T11:58:24.5229883Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5229957Z collecting ... collected 8 items / 7 deselected / 1 selected 2025-12-04T11:58:24.5230176Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5230222Z Running 1 items in this shard 2025-12-04T11:58:24.5230226Z 2025-12-04T11:58:24.5230528Z distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda I1204 11:58:12.936000 360918 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 360987 2025-12-04T11:58:24.5230684Z I1204 11:58:12.936000 360918 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 360988 2025-12-04T11:58:24.5230837Z I1204 11:58:12.937000 360918 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 360989 2025-12-04T11:58:24.5230987Z I1204 11:58:12.938000 360918 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 360990 2025-12-04T11:58:24.5231486Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5231547Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5232036Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5232096Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5232601Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T11:58:24.5232661Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5233146Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T11:58:24.5233205Z device_from_device_id = _get_device_from_device_id( 2025-12-04T11:58:24.5233349Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5233511Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5233802Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5233979Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5234264Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5234389Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5234669Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5234818Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5235097Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5235244Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5235519Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5235657Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5235934Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5236087Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5236553Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5236669Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5236883Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5237239Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5237354Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5237565Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5237729Z [rank1]:E1204 11:58:20.050000 360988 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T11:58:24.5237770Z dist init r=1, world=4 2025-12-04T11:58:24.5237908Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5238067Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5238407Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5238562Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5238847Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5238972Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5239248Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5239398Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5239674Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5239821Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5240098Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5240233Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5240512Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5240659Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5241151Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5241267Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5241465Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5241819Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5241933Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5242146Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5242311Z [rank2]:E1204 11:58:20.053000 360989 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T11:58:24.5242390Z dist init r=2, world=4 2025-12-04T11:58:24.5242527Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5242687Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5242975Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5243129Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5243413Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5243538Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5243813Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5243962Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5244241Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5244389Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5244664Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5244801Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T11:58:24.5245077Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5245246Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5245706Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 0. CUDA driver allocated memory was 2459959296 and is now 3665821696. 
2025-12-04T11:58:24.5245822Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5246019Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5246372Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5246486Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5246696Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5246880Z [rank0]:E1204 11:58:20.095000 360987 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T11:58:24.5246918Z dist init r=0, world=4 2025-12-04T11:58:24.5247056Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T11:58:24.5247214Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T11:58:24.5247503Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5247657Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T11:58:24.5247943Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5248067Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T11:58:24.5248381Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5248531Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5248805Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5248953Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T11:58:24.5249228Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5249364Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
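[editor's note] The "dist init r=N, world=4" lines are each rank initializing its process group, and later in this log the same spawn-based suite is re-run with both "env init_method" and "file init_method". The snippet below is a generic, hedged sketch of those two initialization styles using public torch.distributed APIs; it is not the harness's own setup code, and the address, port, and file path are placeholders.

```python
# Hedged sketch of env:// vs file:// process-group initialization, the two
# init methods referenced later in this log. Generic torch.distributed usage,
# not the test harness's code; MASTER_ADDR/MASTER_PORT and the store path
# below are illustrative placeholders.
import os
import torch.distributed as dist


def init_with_env(rank: int, world_size: int) -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", init_method="env://",
                            rank=rank, world_size=world_size)


def init_with_file(rank: int, world_size: int) -> None:
    # file:// requires a path visible to every rank (shared filesystem)
    dist.init_process_group("nccl", init_method="file:///tmp/dist_init_store",
                            rank=rank, world_size=world_size)
```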
2025-12-04T11:58:24.5249676Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5249824Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T11:58:24.5250286Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 3. CUDA driver allocated memory was 2250244096 and is now 3456106496. 2025-12-04T11:58:24.5250400Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5250598Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5250949Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5251096Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T11:58:24.5251308Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5251471Z [rank3]:E1204 11:58:20.104000 360990 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T11:58:24.5251510Z dist init r=3, world=4 2025-12-04T11:58:24.5251548Z FAILED [8.2126s] [100%] 2025-12-04T11:58:24.5251550Z 2025-12-04T11:58:24.5251608Z =================================== FAILURES =================================== 2025-12-04T11:58:24.5251705Z ________ TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda _________ 2025-12-04T11:58:24.5251752Z Traceback (most recent call last): 2025-12-04T11:58:24.5251915Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T11:58:24.5251960Z self._join_processes(fn) 2025-12-04T11:58:24.5252133Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T11:58:24.5252188Z self._check_return_codes(fn, elapsed_time) 2025-12-04T11:58:24.5252365Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T11:58:24.5252409Z raise RuntimeError(error) 2025-12-04T11:58:24.5252492Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5252537Z Traceback (most recent call last): 2025-12-04T11:58:24.5252700Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5252743Z getattr(self, test_name)() 2025-12-04T11:58:24.5252902Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T11:58:24.5252936Z fn() 2025-12-04T11:58:24.5253088Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5253128Z method(*args, **kwargs) 2025-12-04T11:58:24.5253278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5253318Z method(*args, **kwargs) 2025-12-04T11:58:24.5253487Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5253524Z with policy(): 2025-12-04T11:58:24.5253675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5253717Z raise RuntimeError(msg) 2025-12-04T11:58:24.5254053Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 2025-12-04T11:58:24.5254056Z 2025-12-04T11:58:24.5254131Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5254355Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5254357Z 2025-12-04T11:58:24.5254447Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5254449Z 2025-12-04T11:58:24.5254508Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5254555Z Traceback (most recent call last): 2025-12-04T11:58:24.5254739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5254782Z getattr(self, test_name)() 2025-12-04T11:58:24.5254940Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5254975Z fn() 2025-12-04T11:58:24.5255125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5255166Z method(*args, **kwargs) 2025-12-04T11:58:24.5255317Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5255357Z method(*args, **kwargs) 2025-12-04T11:58:24.5255506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5255543Z with policy(): 2025-12-04T11:58:24.5255695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5255736Z raise RuntimeError(msg) 2025-12-04T11:58:24.5256071Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 
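[editor's note] The "Process N exited with error code 10" frames are raised on the parent side: the traceback shows it joining the per-rank child processes (_join_processes) and re-raising when any exit code is non-zero (_check_return_codes). A much-simplified, hedged version of that pattern using torch.multiprocessing is shown below; the real harness tracks pipes, timeouts, and per-rank error messages that this sketch omits.

```python
# Hedged sketch of the join-and-check pattern visible in the traceback above.
# torch.multiprocessing.spawn with the default join=True already blocks until
# all ranks finish and raises (e.g. ProcessExitedException) if a child exits
# non-zero, which is the behaviour the parent-side RuntimeError reports.
# Not PyTorch's internal code; function names here are illustrative.
import torch.multiprocessing as mp


def _per_rank_body(rank: int, world_size: int) -> None:
    # a real test would init the process group and run its assertions here
    pass


def run_in_subprocesses(world_size: int = 4) -> None:
    mp.spawn(_per_rank_body, args=(world_size,), nprocs=world_size)
```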
2025-12-04T11:58:24.5256073Z 2025-12-04T11:58:24.5256148Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5256371Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5256373Z 2025-12-04T11:58:24.5256460Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5256462Z 2025-12-04T11:58:24.5256466Z 2025-12-04T11:58:24.5256541Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T11:58:24.5256628Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T11:58:24.5256876Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-94eafe77b3573973.xml - 2025-12-04T11:58:24.5256936Z =========================== short test summary info ============================ 2025-12-04T11:58:24.5257203Z FAILED [8.2126s] distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T11:58:24.5257251Z Traceback (most recent call last): 2025-12-04T11:58:24.5257417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5257458Z getattr(self, test_name)() 2025-12-04T11:58:24.5257620Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5257654Z fn() 2025-12-04T11:58:24.5257805Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5257846Z method(*args, **kwargs) 2025-12-04T11:58:24.5257996Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5258035Z method(*args, **kwargs) 2025-12-04T11:58:24.5258225Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5258263Z with policy(): 2025-12-04T11:58:24.5258414Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5258490Z raise RuntimeError(msg) 2025-12-04T11:58:24.5258827Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 1. CUDA driver allocated memory was 2317352960 and is now 3523215360. 
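[editor's note] Each test file run also drops a JUnit-style report (the "- generated xml file: ..." lines above). To inspect failures programmatically instead of scrolling the log, a short standard-library reader is enough; the path below is copied from the log, and the tag layout assumed here is the usual pytest junitxml format (testsuite/testcase/failure).

```python
# Hedged sketch: list failed test cases from the pytest-generated JUnit XML
# named in the log above. Assumes the standard pytest junitxml layout.
import xml.etree.ElementTree as ET

report = ("/var/lib/jenkins/pytorch/test/test-reports/python-pytest/"
          "distributed.fsdp.test_fsdp_exec_order/"
          "distributed.fsdp.test_fsdp_exec_order-94eafe77b3573973.xml")

root = ET.parse(report).getroot()
for case in root.iter("testcase"):
    failure = case.find("failure")
    if failure is not None:
        print(f"{case.get('classname')}.{case.get('name')}: "
              f"{failure.get('message', '')[:120]}")
```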
2025-12-04T11:58:24.5258829Z 2025-12-04T11:58:24.5258903Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5259125Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5259127Z 2025-12-04T11:58:24.5259216Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5259218Z 2025-12-04T11:58:24.5259276Z Process 2 exited with error code 10 and exception: 2025-12-04T11:58:24.5259323Z Traceback (most recent call last): 2025-12-04T11:58:24.5259486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T11:58:24.5259528Z getattr(self, test_name)() 2025-12-04T11:58:24.5259688Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T11:58:24.5259722Z fn() 2025-12-04T11:58:24.5259873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5259912Z method(*args, **kwargs) 2025-12-04T11:58:24.5260061Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T11:58:24.5260101Z method(*args, **kwargs) 2025-12-04T11:58:24.5260251Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T11:58:24.5260287Z with policy(): 2025-12-04T11:58:24.5260438Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T11:58:24.5260481Z raise RuntimeError(msg) 2025-12-04T11:58:24.5260816Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4096 on device 2. CUDA driver allocated memory was 2300575744 and is now 3506438144. 2025-12-04T11:58:24.5260818Z 2025-12-04T11:58:24.5260890Z To execute this test, run the following from the base repo dir: 2025-12-04T11:58:24.5261142Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_exec_order.py TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5261144Z 2025-12-04T11:58:24.5261231Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T11:58:24.5261294Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
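[editor's note] The repro instructions printed above can be run directly from the repo root. The wrapper below just sets the two environment variables the harness mentions and invokes the same test; the command, variables, and test ID are verbatim from the log, and the subprocess driver is merely for convenience (it assumes a ROCm/CUDA-enabled build with enough GPUs).

```python
# Convenience wrapper around the repro command the harness prints above.
# Run from the pytorch repo root; env vars and test path are copied verbatim.
import os
import subprocess
import sys

env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
)
subprocess.run(
    [
        sys.executable,
        "test/distributed/fsdp/test_fsdp_exec_order.py",
        "TestFSDPExecOrderCUDA.test_train_eval_sharding_strategy1_cuda",
    ],
    env=env,
    check=True,
)
```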
2025-12-04T11:58:24.5261357Z ======================= 1 failed, 7 deselected in 8.22s ======================== 2025-12-04T11:58:24.5261395Z Got exit code 1 2025-12-04T11:58:24.5261569Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda 2025-12-04T11:58:24.5261697Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T11:58:24.5261903Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-8433612606330e98.xml 2025-12-04T11:58:24.5261962Z ============================= test session starts ============================== 2025-12-04T11:58:24.5262074Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T11:58:24.5262115Z cachedir: .pytest_cache 2025-12-04T11:58:24.5262272Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T11:58:24.5262340Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T11:58:24.5262381Z configfile: pytest.ini 2025-12-04T11:58:24.5262542Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T11:58:24.5262614Z collecting ... collected 8 items / 8 deselected / 0 selected 2025-12-04T11:58:24.5262666Z stepcurrent: skipping 8 already run items. 2025-12-04T11:58:24.5262711Z Running 0 items in this shard 2025-12-04T11:58:24.5262713Z 2025-12-04T11:58:24.5262959Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_exec_order/distributed.fsdp.test_fsdp_exec_order-8433612606330e98.xml - 2025-12-04T11:58:24.5263018Z ============================ 8 deselected in 0.00s ============================= 2025-12-04T11:58:24.5264519Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_first_iter_order_sharding_strategy1_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_1_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy0_iters_before_path_change_3_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_1_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_invalid_later_iter_order_sharding_strategy1_iters_before_path_change_3_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_exec_order.py::TestFSDPExecOrderCUDA::test_train_eval_sharding_strategy1_cuda'] 2025-12-04T11:58:24.5264525Z 2025-12-04T11:58:24.5264726Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_exec_order 1/1 (test/test-reports/distributed.fsdp.test_fsdp_exec_order_1.1_e994e873868c2dab_.log) 2025-12-04T11:58:24.5264728Z 2025-12-04T11:58:24.5264859Z Finished distributed/fsdp/test_fsdp_exec_order 1/1 ... 
[2025-12-04 11:58:24.429614][2289003.078794959], took 4.36min 2025-12-04T11:58:24.5265122Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:58:24.5265227Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:58:24.5265324Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T11:58:24.5265372Z Uploading artifacts took 0.00 seconds 2025-12-04T11:58:24.5265433Z distributed/fsdp/test_fsdp_exec_order 1/1 failed! 2025-12-04T11:58:24.5265556Z Running distributed/fsdp/test_fsdp_flatten_params 1/1 ... [2025-12-04 11:58:24.432583][2289003.08176702] 2025-12-04T11:58:24.5265605Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:58:24.5265934Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_flatten_params.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:58:24.432769] 2025-12-04T11:59:27.4920090Z 2025-12-04T11:59:27.4921132Z distributed/fsdp/test_fsdp_flatten_params 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_flatten_params_1.1_bf7ca175952f8a78_.log 2025-12-04T11:59:27.4926069Z Running 14 items in this shard: test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_empty_module, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_aligned_full_precision, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_aligned_mixed_precision, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_unaligned, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_with_memory_format_memory_format0, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flat_param_shard_metadata_with_memory_format_memory_format1, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_flatten_nothing, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_numel_with_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_numel_without_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_output_with_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_output_without_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_partial_flattening, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_pnorm_after_step_with_shared_params, test/distributed/fsdp/test_fsdp_flatten_params.py::TestFlattenParams::test_writeback_orig_params_no_shard 2025-12-04T11:59:27.4930878Z 2025-12-04T11:59:27.4931121Z Finished distributed/fsdp/test_fsdp_flatten_params 1/1 ... 
[2025-12-04 11:59:27.491672][2289066.140851202], took 1.05min 2025-12-04T11:59:27.4935288Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T11:59:27.4952281Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T11:59:27.4955011Z Running distributed/test_distributed_spawn 3/7 ... [2025-12-04 11:59:27.495347][2289066.144531064] 2025-12-04T11:59:27.4956402Z MPI not available -- MPI backend tests will be skipped 2025-12-04T11:59:27.4956793Z Running distributed tests for the test backend with env init_method 2025-12-04T11:59:27.4958489Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:59:27.4960011Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:59:27.495834] 2025-12-04T11:59:29.5222749Z 2025-12-04T11:59:29.5224020Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_a71a9c699ade0e28_.log 2025-12-04T11:59:29.5224391Z Running 0 items in this shard: 2025-12-04T11:59:29.5224479Z 2025-12-04T11:59:29.5228826Z Running distributed tests for the test backend with file init_method 2025-12-04T11:59:29.5231631Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:59:29.5232667Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 11:59:29.523057] 2025-12-04T11:59:31.4789067Z 2025-12-04T11:59:31.4790202Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_6b98ab0038112441_.log 2025-12-04T11:59:31.4790868Z Running 0 items in this shard: 2025-12-04T11:59:31.4791023Z 2025-12-04T11:59:31.4795732Z Running distributed tests for the nccl backend with env init_method 2025-12-04T11:59:31.4798529Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T11:59:31.4799563Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 11:59:31.479709] 2025-12-04T12:02:39.4508102Z 2025-12-04T12:02:39.4509129Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_a149f9d8bf39377a_.log 2025-12-04T12:02:39.4520767Z Running 36 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:02:39.4529682Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2025-12-04T12:02:39.4530245Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream 2025-12-04T12:02:39.4530738Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex 2025-12-04T12:02:39.4531232Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2025-12-04T12:02:39.4531723Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2025-12-04T12:02:39.4532171Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex 2025-12-04T12:02:39.4532651Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda 2025-12-04T12:02:39.4533168Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2025-12-04T12:02:39.4533689Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2025-12-04T12:02:39.4534204Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl 2025-12-04T12:02:39.4534679Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group 2025-12-04T12:02:39.4535128Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2025-12-04T12:02:39.4535585Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2025-12-04T12:02:39.4536110Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger 2025-12-04T12:02:39.4536664Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future 2025-12-04T12:02:39.4537220Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2025-12-04T12:02:39.4537676Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2025-12-04T12:02:39.4538044Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference 2025-12-04T12:02:39.4538459Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks 2025-12-04T12:02:39.4538855Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler 2025-12-04T12:02:39.4539234Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features 2025-12-04T12:02:39.4539580Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend 2025-12-04T12:02:39.4539921Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler 2025-12-04T12:02:39.4540300Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2025-12-04T12:02:39.4540738Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather 2025-12-04T12:02:39.4541116Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream 2025-12-04T12:02:39.4541475Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2025-12-04T12:02:39.4541844Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module 2025-12-04T12:02:39.4542240Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity 2025-12-04T12:02:39.4542616Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min 2025-12-04T12:02:39.4542976Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product 2025-12-04T12:02:39.4543322Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min 2025-12-04T12:02:39.4543676Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda 2025-12-04T12:02:39.4544053Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler 2025-12-04T12:02:39.4544416Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2025-12-04T12:02:39.4544788Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:02:39.4545008Z 2025-12-04T12:02:39.4545098Z Running distributed tests for the nccl backend with file init_method 2025-12-04T12:02:39.4545277Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:02:39.4545717Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:02:39.452037] 2025-12-04T12:05:47.4333374Z 2025-12-04T12:05:47.4334245Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_88b49a8a749fb92c_.log 2025-12-04T12:05:47.4345649Z Running 36 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:05:47.4354489Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2025-12-04T12:05:47.4355019Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream 2025-12-04T12:05:47.4355549Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex 2025-12-04T12:05:47.4356021Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2025-12-04T12:05:47.4356498Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2025-12-04T12:05:47.4356929Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex 2025-12-04T12:05:47.4357382Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda 2025-12-04T12:05:47.4357865Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2025-12-04T12:05:47.4358419Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2025-12-04T12:05:47.4358900Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl 2025-12-04T12:05:47.4359395Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group 2025-12-04T12:05:47.4359818Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2025-12-04T12:05:47.4360250Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2025-12-04T12:05:47.4360751Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger 2025-12-04T12:05:47.4361279Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future 2025-12-04T12:05:47.4361728Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2025-12-04T12:05:47.4362160Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2025-12-04T12:05:47.4362583Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference 2025-12-04T12:05:47.4363025Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks 2025-12-04T12:05:47.4363494Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler 2025-12-04T12:05:47.4363891Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features 2025-12-04T12:05:47.4364230Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend 2025-12-04T12:05:47.4364564Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler 2025-12-04T12:05:47.4364940Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2025-12-04T12:05:47.4365318Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather 2025-12-04T12:05:47.4365685Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream 2025-12-04T12:05:47.4366033Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2025-12-04T12:05:47.4366439Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module 2025-12-04T12:05:47.4366825Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity 2025-12-04T12:05:47.4367195Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min 2025-12-04T12:05:47.4367547Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product 2025-12-04T12:05:47.4367882Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min 2025-12-04T12:05:47.4368397Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda 2025-12-04T12:05:47.4368769Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler 2025-12-04T12:05:47.4369122Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2025-12-04T12:05:47.4369483Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:05:47.4369738Z 2025-12-04T12:05:47.4369826Z Running distributed tests for the gloo backend with env init_method 2025-12-04T12:05:47.4369998Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:05:47.4370426Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:05:47.434600] 2025-12-04T12:08:24.4611049Z 2025-12-04T12:08:24.4612231Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_8fc46078fe9fbf29_.log 2025-12-04T12:08:24.4624876Z Running 36 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:08:24.4633044Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2025-12-04T12:08:24.4633601Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream 2025-12-04T12:08:24.4634089Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex 2025-12-04T12:08:24.4634578Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2025-12-04T12:08:24.4635063Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2025-12-04T12:08:24.4635505Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex 2025-12-04T12:08:24.4635979Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda 2025-12-04T12:08:24.4636480Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2025-12-04T12:08:24.4636993Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2025-12-04T12:08:24.4637497Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl 2025-12-04T12:08:24.4637959Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group 2025-12-04T12:08:24.4638448Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2025-12-04T12:08:24.4638896Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2025-12-04T12:08:24.4639472Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger 2025-12-04T12:08:24.4640020Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future 2025-12-04T12:08:24.4640417Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2025-12-04T12:08:24.4640774Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2025-12-04T12:08:24.4641128Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference 2025-12-04T12:08:24.4641499Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks 2025-12-04T12:08:24.4641893Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler 2025-12-04T12:08:24.4642265Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features 2025-12-04T12:08:24.4642678Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend 2025-12-04T12:08:24.4643011Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler 2025-12-04T12:08:24.4643379Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2025-12-04T12:08:24.4643760Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather 2025-12-04T12:08:24.4644133Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream 2025-12-04T12:08:24.4644485Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2025-12-04T12:08:24.4644846Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module 2025-12-04T12:08:24.4645236Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity 2025-12-04T12:08:24.4645605Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min 2025-12-04T12:08:24.4645959Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product 2025-12-04T12:08:24.4646296Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min 2025-12-04T12:08:24.4646644Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda 2025-12-04T12:08:24.4647033Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler 2025-12-04T12:08:24.4647387Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2025-12-04T12:08:24.4647749Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:08:24.4647964Z 2025-12-04T12:08:24.4648052Z Running distributed tests for the gloo backend with file init_method 2025-12-04T12:08:24.4648269Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:08:24.4648733Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=3', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:08:24.462411] 2025-12-04T12:11:02.7453618Z 2025-12-04T12:11:02.7454229Z distributed/test_distributed_spawn 3/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_3.7_3934831f6ebb6547_.log 2025-12-04T12:11:02.7460535Z Running 36 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:11:02.7466218Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_SyncBatchNorm_Channels_Last 2025-12-04T12:11:02.7466663Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_DistributedDataParallel_non_default_stream 2025-12-04T12:11:02.7467055Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_complex 2025-12-04T12:11:02.7467444Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_max_complex_unsupported 2025-12-04T12:11:02.7467832Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_sum_async 2025-12-04T12:11:02.7468218Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_cuda_complex 2025-12-04T12:11:02.7468594Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda 2025-12-04T12:11:02.7469041Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_cuda_complex 2025-12-04T12:11:02.7469448Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_full_group 2025-12-04T12:11:02.7469847Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_no_rank_zero_nccl 2025-12-04T12:11:02.7470219Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_full_group 2025-12-04T12:11:02.7470572Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_object_list 2025-12-04T12:11:02.7470928Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_coalescing_manager_async 2025-12-04T12:11:02.7471344Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_compute_bucket_assignment_by_size_sparse_error_without_logger 2025-12-04T12:11:02.7471783Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_buffer_hook_allreduce_return_future 2025-12-04T12:11:02.7472159Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_create_graph 2025-12-04T12:11:02.7472515Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_hook_pickling_powerSGD 2025-12-04T12:11:02.7472869Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_inference 2025-12-04T12:11:02.7473239Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_model_diff_num_params_across_ranks 2025-12-04T12:11:02.7473630Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_profiling_torch_profiler 2025-12-04T12:11:02.7474001Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_zero_output_features 2025-12-04T12:11:02.7474337Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend 2025-12-04T12:11:02.7474668Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_isend_torch_profiler 2025-12-04T12:11:02.7475040Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_monitored_barrier_gloo_subgroup 2025-12-04T12:11:02.7475455Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allgather 2025-12-04T12:11:02.7475822Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_high_priority_stream 2025-12-04T12:11:02.7476174Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups 2025-12-04T12:11:02.7476533Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_output_unused_in_loss_dict_module 2025-12-04T12:11:02.7476921Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_parity 2025-12-04T12:11:02.7477290Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_full_group_min 2025-12-04T12:11:02.7477644Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_product 2025-12-04T12:11:02.7480494Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_min 2025-12-04T12:11:02.7480841Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_scatter_tensor_cuda 2025-12-04T12:11:02.7481251Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_autograd_profiler 2025-12-04T12:11:02.7481602Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_send_recv_nccl 2025-12-04T12:11:02.7481963Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_skip_all_reduce_unused_parameters 2025-12-04T12:11:02.7482176Z 2025-12-04T12:11:02.7482313Z Finished distributed/test_distributed_spawn 3/7 ... [2025-12-04 12:11:02.745869][2289761.395046316], took 11.59min 2025-12-04T12:11:02.7482753Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:11:02.7486883Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:11:02.7490669Z Running distributed/test_distributed_spawn 6/7 ... 
[2025-12-04 12:11:02.748959][2289761.398140065] 2025-12-04T12:11:02.7491132Z MPI not available -- MPI backend tests will be skipped 2025-12-04T12:11:02.7492025Z Running distributed tests for the test backend with env init_method 2025-12-04T12:11:02.7492905Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:11:02.7495027Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:11:02.749373] 2025-12-04T12:11:04.7062323Z 2025-12-04T12:11:04.7063761Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_5a8ddf85e205a8da_.log 2025-12-04T12:11:04.7064760Z Running 0 items in this shard: 2025-12-04T12:11:04.7064990Z 2025-12-04T12:11:04.7065287Z Running distributed tests for the test backend with file init_method 2025-12-04T12:11:04.7065785Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:11:04.7068501Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:11:04.706622] 2025-12-04T12:11:06.6567193Z 2025-12-04T12:11:06.6569194Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_1a03a9b75b486076_.log 2025-12-04T12:11:06.6570130Z Running 0 items in this shard: 2025-12-04T12:11:06.6570365Z 2025-12-04T12:11:06.6573219Z Running distributed tests for the nccl backend with env init_method 2025-12-04T12:11:06.6573943Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:11:06.6576482Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:11:06.657520] 2025-12-04T12:15:07.4004528Z 2025-12-04T12:15:07.4005646Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_3d920147986b72d2_.log 2025-12-04T12:15:07.4022005Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:15:07.4032306Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2025-12-04T12:15:07.4032947Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2025-12-04T12:15:07.4033478Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class 2025-12-04T12:15:07.4033990Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex 2025-12-04T12:15:07.4034425Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group 2025-12-04T12:15:07.4034833Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2025-12-04T12:15:07.4035243Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2025-12-04T12:15:07.4035643Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2025-12-04T12:15:07.4036045Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product 2025-12-04T12:15:07.4036455Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product 2025-12-04T12:15:07.4036883Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda 2025-12-04T12:15:07.4037321Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split 2025-12-04T12:15:07.4037755Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2025-12-04T12:15:07.4038242Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group 2025-12-04T12:15:07.4038650Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group 2025-12-04T12:15:07.4039100Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group 2025-12-04T12:15:07.4039507Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2025-12-04T12:15:07.4039897Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda 2025-12-04T12:15:07.4040270Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group 2025-12-04T12:15:07.4040661Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook 2025-12-04T12:15:07.4041086Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2025-12-04T12:15:07.4041551Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2025-12-04T12:15:07.4042015Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks 2025-12-04T12:15:07.4042461Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2025-12-04T12:15:07.4042879Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs 2025-12-04T12:15:07.4043270Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized 2025-12-04T12:15:07.4043715Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none 2025-12-04T12:15:07.4044203Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2025-12-04T12:15:07.4044659Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none 2025-12-04T12:15:07.4045078Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd 2025-12-04T12:15:07.4045428Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone 2025-12-04T12:15:07.4045777Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2025-12-04T12:15:07.4046125Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2025-12-04T12:15:07.4046465Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks 2025-12-04T12:15:07.4046794Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda 2025-12-04T12:15:07.4047122Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2025-12-04T12:15:07.4047462Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup 2025-12-04T12:15:07.4047827Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2025-12-04T12:15:07.4048283Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size 2025-12-04T12:15:07.4048708Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload 2025-12-04T12:15:07.4049106Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max 2025-12-04T12:15:07.4049445Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2025-12-04T12:15:07.4049798Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:15:07.4050002Z 2025-12-04T12:15:07.4050093Z Running distributed tests for the nccl backend with file init_method 2025-12-04T12:15:07.4050263Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:15:07.4050693Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:15:07.401730] 2025-12-04T12:19:07.4006544Z 2025-12-04T12:19:07.4007171Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_78c0cc8c2511fca6_.log 2025-12-04T12:19:07.4014806Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:19:07.4021657Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2025-12-04T12:19:07.4022111Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2025-12-04T12:19:07.4022489Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class 2025-12-04T12:19:07.4022855Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex 2025-12-04T12:19:07.4023224Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group 2025-12-04T12:19:07.4023597Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2025-12-04T12:19:07.4023972Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2025-12-04T12:19:07.4024335Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2025-12-04T12:19:07.4024703Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product 2025-12-04T12:19:07.4025083Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product 2025-12-04T12:19:07.4025481Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda 2025-12-04T12:19:07.4025878Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split 2025-12-04T12:19:07.4026277Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2025-12-04T12:19:07.4026683Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group 2025-12-04T12:19:07.4027101Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group 2025-12-04T12:19:07.4027460Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group 2025-12-04T12:19:07.4027833Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2025-12-04T12:19:07.4028243Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda 2025-12-04T12:19:07.4028581Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group 2025-12-04T12:19:07.4028939Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook 2025-12-04T12:19:07.4029333Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2025-12-04T12:19:07.4029751Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2025-12-04T12:19:07.4030199Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks 2025-12-04T12:19:07.4030604Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2025-12-04T12:19:07.4030972Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs 2025-12-04T12:19:07.4031328Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized 2025-12-04T12:19:07.4031734Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none 2025-12-04T12:19:07.4032182Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2025-12-04T12:19:07.4032626Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none 2025-12-04T12:19:07.4033042Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd 2025-12-04T12:19:07.4033389Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone 2025-12-04T12:19:07.4033735Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2025-12-04T12:19:07.4034084Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2025-12-04T12:19:07.4034421Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks 2025-12-04T12:19:07.4034753Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda 2025-12-04T12:19:07.4035082Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2025-12-04T12:19:07.4035422Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup 2025-12-04T12:19:07.4035785Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2025-12-04T12:19:07.4036199Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size 2025-12-04T12:19:07.4036656Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload 2025-12-04T12:19:07.4037025Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max 2025-12-04T12:19:07.4037386Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2025-12-04T12:19:07.4037737Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:19:07.4037940Z 2025-12-04T12:19:07.4038027Z Running distributed tests for the gloo backend with env init_method 2025-12-04T12:19:07.4038242Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:19:07.4038674Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:19:07.401956] 2025-12-04T12:22:43.2467296Z 2025-12-04T12:22:43.2467942Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_7af0010540e7ac65_.log 2025-12-04T12:22:43.2480303Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:22:43.2488441Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2025-12-04T12:22:43.2488898Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2025-12-04T12:22:43.2489280Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class 2025-12-04T12:22:43.2489647Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex 2025-12-04T12:22:43.2490017Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group 2025-12-04T12:22:43.2490386Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2025-12-04T12:22:43.2490763Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2025-12-04T12:22:43.2491149Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2025-12-04T12:22:43.2491516Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product 2025-12-04T12:22:43.2491890Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product 2025-12-04T12:22:43.2492280Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda 2025-12-04T12:22:43.2492676Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split 2025-12-04T12:22:43.2493122Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2025-12-04T12:22:43.2493530Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group 2025-12-04T12:22:43.2493899Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group 2025-12-04T12:22:43.2494254Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group 2025-12-04T12:22:43.2494624Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2025-12-04T12:22:43.2494977Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda 2025-12-04T12:22:43.2495312Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group 2025-12-04T12:22:43.2495670Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook 2025-12-04T12:22:43.2496086Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2025-12-04T12:22:43.2496522Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2025-12-04T12:22:43.2496943Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks 2025-12-04T12:22:43.2497328Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2025-12-04T12:22:43.2497694Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs 2025-12-04T12:22:43.2498050Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized 2025-12-04T12:22:43.2498494Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none 2025-12-04T12:22:43.2498941Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2025-12-04T12:22:43.2499385Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none 2025-12-04T12:22:43.2499803Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd 2025-12-04T12:22:43.2500149Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone 2025-12-04T12:22:43.2500496Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2025-12-04T12:22:43.2500845Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2025-12-04T12:22:43.2501183Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks 2025-12-04T12:22:43.2501510Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda 2025-12-04T12:22:43.2501834Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2025-12-04T12:22:43.2502173Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup 2025-12-04T12:22:43.2502577Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2025-12-04T12:22:43.2502989Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size 2025-12-04T12:22:43.2503416Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload 2025-12-04T12:22:43.2503783Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max 2025-12-04T12:22:43.2504119Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2025-12-04T12:22:43.2504471Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:22:43.2504672Z 2025-12-04T12:22:43.2504764Z Running distributed tests for the gloo backend with file init_method 2025-12-04T12:22:43.2504935Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:22:43.2505361Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_distributed_spawn.py', '--shard-id=6', '--num-shards=7', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:22:43.247549] 2025-12-04T12:26:16.0997611Z 2025-12-04T12:26:16.0998489Z distributed/test_distributed_spawn 6/7 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_distributed_spawn_6.7_e2aa3221fb17d374_.log 2025-12-04T12:26:16.1012008Z Running 43 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none, 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min, test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:26:16.1021548Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_1_level_hierarchical_model_averager_equivalent_to_periodic_model_averager 2025-12-04T12:26:16.1022048Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_3_level_hierarchical_model_averager 2025-12-04T12:26:16.1022460Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_Backend_enum_class 2025-12-04T12:26:16.1022862Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_coalesced_complex 2025-12-04T12:26:16.1023262Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_full_group 2025-12-04T12:26:16.1023664Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_into_cat_tensor_cuda 2025-12-04T12:26:16.1024079Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_gather_object_subgroup 2025-12-04T12:26:16.1024478Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_min 2025-12-04T12:26:16.1024880Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_coalesced_product 2025-12-04T12:26:16.1025289Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_reduce_full_group_product 2025-12-04T12:26:16.1025722Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_equal_split_group_cuda 2025-12-04T12:26:16.1026207Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split 2025-12-04T12:26:16.1026643Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_full_group 2025-12-04T12:26:16.1027087Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_all_to_all_single_unequal_split_group 2025-12-04T12:26:16.1027493Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_full_group 2025-12-04T12:26:16.1027883Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_barrier_timeout_full_group 2025-12-04T12:26:16.1028324Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_batch_isend_irecv_op_list_err 2025-12-04T12:26:16.1028720Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_cuda 2025-12-04T12:26:16.1029087Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_broadcast_group 2025-12-04T12:26:16.1029518Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_broadcast_buffer_via_hook 2025-12-04T12:26:16.1029961Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping 2025-12-04T12:26:16.1030425Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_build_debug_param_to_name_mapping_requires_grad 2025-12-04T12:26:16.1030887Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_control_flow_different_across_ranks 2025-12-04T12:26:16.1031310Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_forward_backward_hook 2025-12-04T12:26:16.1031674Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_grad_div_uneven_inputs 2025-12-04T12:26:16.1032027Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_has_finalized 2025-12-04T12:26:16.1032429Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_grad_as_bucket_view_set_grad_to_none 2025-12-04T12:26:16.1032869Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_ignored_params 2025-12-04T12:26:16.1033313Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_native_mixed_precision_no_grad_as_bucket_view_no_set_grad_none 2025-12-04T12:26:16.1033734Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_new_tensor_in_fwd 2025-12-04T12:26:16.1034079Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sink_noclone 2025-12-04T12:26:16.1034426Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_sync_module_states 2025-12-04T12:26:16.1034774Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_ddp_uneven_inputs 2025-12-04T12:26:16.1035110Z Running 1 items in this shard: 
test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_checks 2025-12-04T12:26:16.1035437Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_cuda 2025-12-04T12:26:16.1035761Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_group 2025-12-04T12:26:16.1036133Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_gather_object_subgroup 2025-12-04T12:26:16.1036498Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_nccl_backend_bool_allreduce 2025-12-04T12:26:16.1036913Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_new_subgroups_by_enumeration_input_rank_exceeds_world_size 2025-12-04T12:26:16.1037338Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_post_localSGD_optimizer_step_reload 2025-12-04T12:26:16.1037704Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_max 2025-12-04T12:26:16.1038040Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_reduce_group_min 2025-12-04T12:26:16.1038433Z Running 1 items in this shard: test/distributed/test_distributed_spawn.py::TestDistBackendWithSpawn::test_static_graph_multi_forward 2025-12-04T12:26:16.1038636Z 2025-12-04T12:26:16.1038809Z Finished distributed/test_distributed_spawn 6/7 ... [2025-12-04 12:26:16.100672][2290674.749850741], took 15.22min 2025-12-04T12:26:16.1039258Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:26:16.1039658Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:26:16.1039878Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:26:16.1040059Z Uploading artifacts took 0.00 seconds 2025-12-04T12:26:16.1041578Z Running distributed/fsdp/test_fsdp_traversal 1/1 ... [2025-12-04 12:26:16.104064][2290674.753248032] 2025-12-04T12:26:16.1041791Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:26:16.1043134Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_traversal.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:26:16.104229] 2025-12-04T12:26:41.5012658Z 2025-12-04T12:26:41.5013381Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_traversal 1/1 (test/test-reports/distributed.fsdp.test_fsdp_traversal_1.1_ef9ad764013e9636_.log) 2025-12-04T12:26:41.5014017Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-fdadd662e8b4052c.xml 2025-12-04T12:26:41.5014444Z ============================= test session starts ============================== 2025-12-04T12:26:41.5014781Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:26:41.5015043Z cachedir: .pytest_cache 2025-12-04T12:26:41.5015362Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:26:41.5015696Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:26:41.5015864Z configfile: pytest.ini 2025-12-04T12:26:41.5016210Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:26:41.5016621Z collecting ... collected 1 item 2025-12-04T12:26:41.5016813Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:26:41.5017178Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2025-12-04T12:26:41.5017432Z 2025-12-04T12:26:41.5017811Z distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda I1204 12:26:17.817000 448654 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 448723 2025-12-04T12:26:41.5019017Z I1204 12:26:17.818000 448654 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 448724 2025-12-04T12:26:41.5019475Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5019940Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5020598Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5021302Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5021978Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5022545Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5023190Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5023826Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5024412Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5024993Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5025579Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5026149Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5026721Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5027305Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5028095Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5028882Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5029323Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5030019Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5030610Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5031118Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5031543Z [rank1]:E1204 12:26:21.591000 448724 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:26:41.5031792Z dist init r=1, world=2 2025-12-04T12:26:41.5032001Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5032372Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5032904Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5033432Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5033918Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5034415Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5034858Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5035327Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5035801Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5036267Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5036734Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5037194Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5037653Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5038123Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5038791Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 
2025-12-04T12:26:41.5039379Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5039731Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5040286Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5040790Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5041155Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5041569Z [rank0]:E1204 12:26:21.653000 448723 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:26:41.5041809Z dist init r=0, world=2 2025-12-04T12:26:41.5042227Z [rank0]:[W1204 12:26:21.509278405 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:26:41.5042650Z FAILED [5.2094s] [100%] 2025-12-04T12:26:41.5042715Z 2025-12-04T12:26:41.5042781Z =================================== FAILURES =================================== 2025-12-04T12:26:41.5042967Z ___________________ TestTraversalCUDA.test_fsdp_modules_cuda ___________________ 2025-12-04T12:26:41.5043161Z Traceback (most recent call last): 2025-12-04T12:26:41.5043409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:26:41.5043730Z self._join_processes(fn) 2025-12-04T12:26:41.5043978Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:26:41.5044242Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:26:41.5044507Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:26:41.5044765Z raise RuntimeError(error) 2025-12-04T12:26:41.5044917Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:26:41.5045081Z Traceback (most recent call last): 2025-12-04T12:26:41.5045320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5045562Z getattr(self, test_name)() 2025-12-04T12:26:41.5045794Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5046026Z fn() 2025-12-04T12:26:41.5046229Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5046460Z method(*args, **kwargs) 2025-12-04T12:26:41.5046681Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5046910Z method(*args, **kwargs) 2025-12-04T12:26:41.5047127Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5047355Z with policy(): 2025-12-04T12:26:41.5047567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5047798Z raise RuntimeError(msg) 2025-12-04T12:26:41.5048218Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5048560Z 2025-12-04T12:26:41.5048639Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5048945Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5049175Z 2025-12-04T12:26:41.5049267Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5049393Z 2025-12-04T12:26:41.5049395Z 2025-12-04T12:26:41.5049522Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:26:41.5049728Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:26:41.5050106Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-fdadd662e8b4052c.xml - 2025-12-04T12:26:41.5050455Z =========================== short test summary info ============================ 2025-12-04T12:26:41.5050769Z FAILED [5.2094s] distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:26:41.5051060Z Traceback (most recent call last): 2025-12-04T12:26:41.5051305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5051547Z getattr(self, test_name)() 2025-12-04T12:26:41.5051781Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5052077Z fn() 2025-12-04T12:26:41.5052278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5052532Z method(*args, **kwargs) 2025-12-04T12:26:41.5052750Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5052976Z method(*args, **kwargs) 2025-12-04T12:26:41.5053191Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5053415Z with policy(): 2025-12-04T12:26:41.5053625Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5053854Z raise RuntimeError(msg) 2025-12-04T12:26:41.5054303Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 
2025-12-04T12:26:41.5054649Z 2025-12-04T12:26:41.5054724Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5055028Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5055255Z 2025-12-04T12:26:41.5055344Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5055565Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:26:41.5055723Z ============================== 1 failed in 5.35s =============================== 2025-12-04T12:26:41.5055855Z Got exit code 1 2025-12-04T12:26:41.5055956Z Retrying single test... 2025-12-04T12:26:41.5056224Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-168649b91fa6c9f9.xml 2025-12-04T12:26:41.5056523Z ============================= test session starts ============================== 2025-12-04T12:26:41.5056735Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:26:41.5056922Z cachedir: .pytest_cache 2025-12-04T12:26:41.5057145Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:26:41.5057382Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:26:41.5057499Z configfile: pytest.ini 2025-12-04T12:26:41.5057725Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:26:41.5057966Z collecting ... collected 1 item 2025-12-04T12:26:41.5058293Z stepcurrent: skipping 0 already run items. 
Running only test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2025-12-04T12:26:41.5058556Z Running 1 items in this shard 2025-12-04T12:26:41.5058628Z 2025-12-04T12:26:41.5058901Z distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda I1204 12:26:25.424000 448882 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 448951 2025-12-04T12:26:41.5059366Z I1204 12:26:25.424000 448882 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 448952 2025-12-04T12:26:41.5059699Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5060040Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5060533Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5061028Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5061526Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5061973Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5062420Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5062889Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5063352Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5063814Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5064275Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5064725Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5065180Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5065645Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5066271Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:26:41.5066852Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5067225Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5067776Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5068292Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5068658Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5069072Z [rank0]:E1204 12:26:29.237000 448951 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:26:41.5069311Z dist init r=0, world=2 2025-12-04T12:26:41.5069518Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5069855Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5070357Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5070854Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5071332Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5071778Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5072218Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5072682Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5073147Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5073612Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5074073Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 
3328, in wrapper 2025-12-04T12:26:41.5074524Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5074980Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5075445Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5076066Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5076673Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5077030Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5077583Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5078053Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5078461Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5078883Z [rank1]:E1204 12:26:29.291000 448952 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:26:41.5079127Z dist init r=1, world=2 2025-12-04T12:26:41.5079547Z [rank0]:[W1204 12:26:29.077781766 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:26:41.5079977Z FAILED [5.2096s] [100%] 2025-12-04T12:26:41.5080040Z 2025-12-04T12:26:41.5080099Z =================================== FAILURES =================================== 2025-12-04T12:26:41.5080282Z ___________________ TestTraversalCUDA.test_fsdp_modules_cuda ___________________ 2025-12-04T12:26:41.5080455Z Traceback (most recent call last): 2025-12-04T12:26:41.5080709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:26:41.5080959Z self._join_processes(fn) 2025-12-04T12:26:41.5081214Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:26:41.5081487Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:26:41.5081763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:26:41.5082031Z raise RuntimeError(error) 2025-12-04T12:26:41.5082191Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:26:41.5082360Z Traceback (most recent call last): 2025-12-04T12:26:41.5082607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5082857Z getattr(self, test_name)() 2025-12-04T12:26:41.5083097Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5083335Z fn() 2025-12-04T12:26:41.5083546Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5083786Z method(*args, **kwargs) 2025-12-04T12:26:41.5084015Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5084255Z method(*args, **kwargs) 2025-12-04T12:26:41.5084480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5084712Z with policy(): 2025-12-04T12:26:41.5084931Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5085169Z raise RuntimeError(msg) 2025-12-04T12:26:41.5085590Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:26:41.5085935Z 2025-12-04T12:26:41.5086014Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5086322Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5086552Z 2025-12-04T12:26:41.5086642Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5086767Z 2025-12-04T12:26:41.5086769Z 2025-12-04T12:26:41.5086849Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:26:41.5087052Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:26:41.5087426Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-168649b91fa6c9f9.xml - 2025-12-04T12:26:41.5087779Z =========================== short test summary info ============================ 2025-12-04T12:26:41.5088110Z FAILED [5.2096s] distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:26:41.5088547Z Traceback (most recent call last): 2025-12-04T12:26:41.5088827Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5089350Z getattr(self, test_name)() 2025-12-04T12:26:41.5089628Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5089895Z fn() 2025-12-04T12:26:41.5090164Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5090431Z method(*args, **kwargs) 2025-12-04T12:26:41.5090696Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5090992Z method(*args, **kwargs) 2025-12-04T12:26:41.5091249Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5091529Z with policy(): 2025-12-04T12:26:41.5091786Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5092057Z raise RuntimeError(msg) 2025-12-04T12:26:41.5092486Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:26:41.5092850Z 2025-12-04T12:26:41.5092937Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5093275Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5093551Z 2025-12-04T12:26:41.5093649Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5093884Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:26:41.5094085Z ============================== 1 failed in 5.36s =============================== 2025-12-04T12:26:41.5094263Z Got exit code 1 2025-12-04T12:26:41.5094404Z Retrying single test... 
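Note on the failure above: it is raised by the CUDA memory leak check that wraps each test when mem_leak_check is enabled for the shard. The check compares caching-allocator and driver-level memory counters on each device before and after the test body and fails when both have grown, which is exactly what the "Caching allocator allocated memory was 512 and is now reported as 2560" message reports. The lines below are a minimal sketch of that kind of before/after comparison, using only public torch.cuda APIs; it is an illustration, not the actual implementation in torch/testing/_internal/common_utils.py.

# Hedged sketch of a before/after CUDA memory comparison similar in spirit to the
# leak check reported above; not the actual CUDAMemoryLeakCheck implementation.
import torch

def check_for_leak(test_fn, device: int = 0) -> None:
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator bytes
    free_before, _total = torch.cuda.mem_get_info(device)   # driver-level free memory
    test_fn()
    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _total = torch.cuda.mem_get_info(device)
    # Flag a leak only when the caching allocator grew and the driver also
    # reports less free memory, mirroring the two-level check in the log above.
    if alloc_after > alloc_before and free_after < free_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after} bytes"
        )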
2025-12-04T12:26:41.5094736Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-c6de5f5f6db77275.xml 2025-12-04T12:26:41.5095138Z ============================= test session starts ============================== 2025-12-04T12:26:41.5095412Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:26:41.5095670Z cachedir: .pytest_cache 2025-12-04T12:26:41.5095930Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:26:41.5096242Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:26:41.5096387Z configfile: pytest.ini 2025-12-04T12:26:41.5096650Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:26:41.5096961Z collecting ... collected 1 item 2025-12-04T12:26:41.5097258Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2025-12-04T12:26:41.5097563Z Running 1 items in this shard 2025-12-04T12:26:41.5097661Z 2025-12-04T12:26:41.5097960Z distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda I1204 12:26:33.302000 449110 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 449179 2025-12-04T12:26:41.5098503Z I1204 12:26:33.303000 449110 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 449180 2025-12-04T12:26:41.5098910Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5099306Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5099832Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5100370Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5100890Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5101383Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5101869Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5102374Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5102894Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5103399Z [rank1]:E1204 12:26:37.131000 449180 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5103955Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5104451Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5104948Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5105512Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5106257Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5106872Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5107283Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5107872Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5108422Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5108836Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5109303Z [rank1]:E1204 12:26:37.131000 449180 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:26:41.5109616Z dist init r=1, world=2 2025-12-04T12:26:41.5109859Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:26:41.5110259Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:26:41.5110788Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5111300Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:26:41.5111850Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5112336Z [rank0]:E1204 12:26:37.194000 449179 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:26:41.5112815Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5113326Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5113828Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5114335Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:26:41.5114842Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5115339Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:26:41.5115872Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5116396Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:26:41.5117063Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 0. CUDA driver allocated memory was 2017460224 and is now 2021654528. 2025-12-04T12:26:41.5117684Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5118073Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5118721Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5119250Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:26:41.5119658Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5120132Z [rank0]:E1204 12:26:37.194000 449179 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:26:41.5120409Z dist init r=0, world=2 2025-12-04T12:26:41.5120856Z [rank0]:[W1204 12:26:37.066340097 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:26:41.5121314Z FAILED [5.2103s] [100%] 2025-12-04T12:26:41.5121404Z 2025-12-04T12:26:41.5121472Z =================================== FAILURES =================================== 2025-12-04T12:26:41.5121715Z ___________________ TestTraversalCUDA.test_fsdp_modules_cuda ___________________ 2025-12-04T12:26:41.5121920Z Traceback (most recent call last): 2025-12-04T12:26:41.5122197Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:26:41.5122504Z self._join_processes(fn) 2025-12-04T12:26:41.5122788Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:26:41.5123101Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:26:41.5123418Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:26:41.5123724Z raise RuntimeError(error) 2025-12-04T12:26:41.5123941Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:26:41.5124129Z Traceback (most recent call last): 2025-12-04T12:26:41.5124436Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5124722Z getattr(self, test_name)() 2025-12-04T12:26:41.5124995Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5125275Z fn() 2025-12-04T12:26:41.5125516Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5125785Z method(*args, **kwargs) 2025-12-04T12:26:41.5126064Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5126335Z method(*args, **kwargs) 2025-12-04T12:26:41.5126629Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5126902Z with policy(): 2025-12-04T12:26:41.5127158Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5127445Z raise RuntimeError(msg) 2025-12-04T12:26:41.5127865Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5128257Z 2025-12-04T12:26:41.5128354Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5128719Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5128966Z 2025-12-04T12:26:41.5129080Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5129229Z 2025-12-04T12:26:41.5129230Z 2025-12-04T12:26:41.5129346Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:26:41.5129608Z Process 1 terminated with exit code 10, terminating remaining processes. 
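Note on the ProcessGroupNCCL warning printed in each run above ("destroy_process_group() was not called before program exit"): when reproducing outside the test harness, the usual pattern is to pair init_process_group with an explicit destroy_process_group. A minimal single-process sketch is below; the backend choice and MASTER_ADDR/MASTER_PORT values are illustrative placeholders, not taken from this job.

# Minimal sketch of pairing init/destroy so the shutdown warning above does not
# fire; the environment values here are placeholders for a local reproduction.
import os
import torch.distributed as dist

def main(rank: int = 0, world_size: int = 1) -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    try:
        pass  # test body / collective calls would go here
    finally:
        dist.destroy_process_group()  # explicit shutdown avoids the warning

if __name__ == "__main__":
    main()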
2025-12-04T12:26:41.5130015Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-c6de5f5f6db77275.xml - 2025-12-04T12:26:41.5130415Z =========================== short test summary info ============================ 2025-12-04T12:26:41.5130762Z FAILED [5.2103s] distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:26:41.5131084Z Traceback (most recent call last): 2025-12-04T12:26:41.5131396Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:26:41.5131679Z getattr(self, test_name)() 2025-12-04T12:26:41.5131965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:26:41.5132241Z fn() 2025-12-04T12:26:41.5132479Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5132764Z method(*args, **kwargs) 2025-12-04T12:26:41.5133018Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:26:41.5133278Z method(*args, **kwargs) 2025-12-04T12:26:41.5133551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:26:41.5133815Z with policy(): 2025-12-04T12:26:41.5134081Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:26:41.5134358Z raise RuntimeError(msg) 2025-12-04T12:26:41.5134776Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestTraversalCUDA.test_fsdp_modules_cuda! Caching allocator allocated memory was 512 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1864368128 and is now 1868562432. 2025-12-04T12:26:41.5135151Z 2025-12-04T12:26:41.5135257Z To execute this test, run the following from the base repo dir: 2025-12-04T12:26:41.5135604Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_traversal.py TestTraversalCUDA.test_fsdp_modules_cuda 2025-12-04T12:26:41.5135842Z 2025-12-04T12:26:41.5143185Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:26:41.5143449Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:26:41.5143696Z ============================== 1 failed in 5.37s =============================== 2025-12-04T12:26:41.5143840Z Got exit code 1 2025-12-04T12:26:41.5144059Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda 2025-12-04T12:26:41.5144390Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:26:41.5144769Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-eecea1a21e81fd93.xml 2025-12-04T12:26:41.5145077Z ============================= test session starts ============================== 2025-12-04T12:26:41.5145293Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:26:41.5145486Z cachedir: .pytest_cache 2025-12-04T12:26:41.5145718Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:26:41.5145962Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:26:41.5146106Z configfile: pytest.ini 2025-12-04T12:26:41.5146338Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:26:41.5146631Z collecting ... collected 1 item / 1 deselected / 0 selected 2025-12-04T12:26:41.5146798Z stepcurrent: skipping 1 already run items. 2025-12-04T12:26:41.5146932Z Running 0 items in this shard 2025-12-04T12:26:41.5147010Z 2025-12-04T12:26:41.5147265Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_traversal/distributed.fsdp.test_fsdp_traversal-eecea1a21e81fd93.xml - 2025-12-04T12:26:41.5147615Z ============================ 1 deselected in 0.00s ============================= 2025-12-04T12:26:41.5147888Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_traversal.py::TestTraversalCUDA::test_fsdp_modules_cuda'] 2025-12-04T12:26:41.5148102Z 2025-12-04T12:26:41.5148343Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_traversal 1/1 (test/test-reports/distributed.fsdp.test_fsdp_traversal_1.1_ef9ad764013e9636_.log) 2025-12-04T12:26:41.5148579Z 2025-12-04T12:26:41.5148717Z Finished distributed/fsdp/test_fsdp_traversal 1/1 ... [2025-12-04 12:26:41.500951][2290700.150130212], took 0.42min 2025-12-04T12:26:41.5149160Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:26:41.5149549Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:26:41.5149771Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:26:41.5149954Z Uploading artifacts took 0.00 seconds 2025-12-04T12:26:41.5150098Z distributed/fsdp/test_fsdp_traversal 1/1 failed! 2025-12-04T12:26:41.5150306Z Running distributed/test_serialization 1/1 ... [2025-12-04 12:26:41.504565][2290700.15374837] 2025-12-04T12:26:41.5150504Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:26:41.5150906Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_serialization.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:26:41.504754] 2025-12-04T12:26:43.9231003Z 2025-12-04T12:26:43.9231794Z distributed/test_serialization 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_serialization_1.1_b8711cdeeb133aaa_.log 2025-12-04T12:26:43.9235145Z Running 11 items in this shard: test/distributed/test_serialization.py::TestSerialization::test_cuda, test/distributed/test_serialization.py::TestSerialization::test_dtensor, test/distributed/test_serialization.py::TestSerialization::test_empty_tensor, test/distributed/test_serialization.py::TestSerialization::test_nested_tensors, test/distributed/test_serialization.py::TestSerialization::test_python_object, test/distributed/test_serialization.py::TestSerialization::test_scalar_tensor, test/distributed/test_serialization.py::TestSerialization::test_str_utf8, test/distributed/test_serialization.py::TestSerialization::test_strided_tensor, test/distributed/test_serialization.py::TestSerialization::test_tensor_with_offset, test/distributed/test_serialization.py::TestSerialization::test_various_data_types, test/distributed/test_serialization.py::TestSerialization::test_weights_only 2025-12-04T12:26:43.9237426Z 2025-12-04T12:26:43.9237661Z Finished distributed/test_serialization 1/1 ... [2025-12-04 12:26:43.922746][2290702.571925858], took 0.04min 2025-12-04T12:26:43.9249395Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:26:43.9276187Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:26:43.9280016Z Running distributed/fsdp/test_fsdp_multiple_wrapping 1/1 ... [2025-12-04 12:26:43.927796][2290702.576973201] 2025-12-04T12:26:43.9281658Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:26:43.9282241Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_multiple_wrapping.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:26:43.928082] 2025-12-04T12:27:16.6851764Z 2025-12-04T12:27:16.6856148Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_multiple_wrapping 1/1 (test/test-reports/distributed.fsdp.test_fsdp_multiple_wrapping_1.1_7d9d262da9a8dffa_.log) 2025-12-04T12:27:16.6856992Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-65c7637dc0619de0.xml 2025-12-04T12:27:16.6857472Z ============================= test session starts ============================== 2025-12-04T12:27:16.6857773Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:27:16.6858038Z cachedir: .pytest_cache 2025-12-04T12:27:16.6858418Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:27:16.6858755Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:27:16.6858922Z configfile: pytest.ini 2025-12-04T12:27:16.6859228Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:27:16.6859555Z collecting ... 
collected 1 item 2025-12-04T12:27:16.6859745Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:27:16.6860179Z Running 1 items in this shard: test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda 2025-12-04T12:27:16.6860478Z 2025-12-04T12:27:16.6860904Z distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda I1204 12:26:45.680000 449476 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 449545 2025-12-04T12:27:16.6861610Z I1204 12:26:45.681000 449476 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 449546 2025-12-04T12:27:16.6862073Z I1204 12:26:45.682000 449476 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 449547 2025-12-04T12:27:16.6862532Z I1204 12:26:45.682000 449476 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 449548 2025-12-04T12:27:16.6865419Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6866222Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6867021Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6867657Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6868338Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6869016Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6869640Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:27:16.6870331Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6870592Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6870971Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6871506Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6872034Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6872557Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6873042Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6873524Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6874032Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6874535Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6875036Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6875535Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6876081Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6876572Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6877058Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6877712Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 
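[editor's note] The UserWarning repeated above for every rank points at the fix itself: give FSDP an indexed device instead of the bare string "cuda". The following is a minimal, hypothetical sketch (not the test's actual code; it assumes the process group is already initialized and that `rank` is the local GPU index) showing the two options the warning names:

```python
# Hedged sketch, not test_fsdp_multiple_wrapping itself: silence the
# "`device_id` cuda ... does not have an explicit index" warning by making
# the per-rank device explicit before FSDP initialization.
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def init_fsdp_model(rank: int) -> FSDP:
    # Assumes torch.distributed.init_process_group(...) has already run.
    torch.cuda.set_device(rank)                 # option 1: set the current device explicitly
    model = nn.Linear(8, 8).cuda(rank)
    # option 2: pass an indexed device rather than the bare "cuda" string
    return FSDP(model, device_id=torch.device("cuda", rank))
```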
2025-12-04T12:27:16.6878374Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6878733Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6879356Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6879890Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6880261Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6880680Z [rank3]:E1204 12:26:51.987000 449548 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:27:16.6880927Z dist init r=3, world=4 2025-12-04T12:27:16.6881138Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6881479Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6881971Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6882452Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6882932Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6883387Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6883902Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6884369Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6884833Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6885298Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6885804Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6886260Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:27:16.6886717Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6887183Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6887832Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 2025-12-04T12:27:16.6888488Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6888867Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6889462Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6889968Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6890337Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6890752Z [rank1]:E1204 12:26:52.025000 449546 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:27:16.6890998Z dist init r=1, world=4 2025-12-04T12:27:16.6891204Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6891542Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6892029Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6892513Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6892994Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6893443Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6893883Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6894351Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:27:16.6894857Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6895323Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6895788Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6896242Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6896696Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6897167Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6897812Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3456106496. 2025-12-04T12:27:16.6898513Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6898866Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6899482Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6899988Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6900354Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6900771Z [rank0]:E1204 12:26:52.030000 449545 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:27:16.6901112Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6901447Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6901936Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6902418Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6902896Z [rank2]:E1204 12:26:52.031000 449547 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6903344Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6903790Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6904294Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6904767Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6905237Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6905705Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6906161Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6906627Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6907110Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6907778Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944. 
2025-12-04T12:27:16.6908423Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6908777Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6909371Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6909880Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6910248Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6910668Z [rank2]:E1204 12:26:52.031000 449547 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:27:16.6910916Z dist init r=0, world=4 2025-12-04T12:27:16.6911024Z dist init r=2, world=4 2025-12-04T12:27:16.6911130Z FAILED [7.4135s] [100%] 2025-12-04T12:27:16.6911201Z 2025-12-04T12:27:16.6911263Z =================================== FAILURES =================================== 2025-12-04T12:27:16.6911463Z _____________ TestMultipleWrappingCUDA.test_multiple_wrapping_cuda _____________ 2025-12-04T12:27:16.6911650Z Traceback (most recent call last): 2025-12-04T12:27:16.6911904Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:27:16.6912156Z self._join_processes(fn) 2025-12-04T12:27:16.6912405Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:27:16.6912672Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:27:16.6912939Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:27:16.6913201Z raise RuntimeError(error) 2025-12-04T12:27:16.6913360Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:27:16.6913560Z Traceback (most recent call last): 2025-12-04T12:27:16.6913807Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6914050Z getattr(self, test_name)() 2025-12-04T12:27:16.6914289Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6914521Z fn() 2025-12-04T12:27:16.6914727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6914958Z method(*args, **kwargs) 2025-12-04T12:27:16.6915182Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6915412Z method(*args, **kwargs) 2025-12-04T12:27:16.6915631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6915858Z with policy(): 2025-12-04T12:27:16.6916070Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6916326Z raise RuntimeError(msg) 2025-12-04T12:27:16.6916739Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 2025-12-04T12:27:16.6917102Z 2025-12-04T12:27:16.6917177Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6917515Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6917781Z 2025-12-04T12:27:16.6917872Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6918003Z 2025-12-04T12:27:16.6918005Z 2025-12-04T12:27:16.6918087Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:27:16.6918328Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:27:16.6918733Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-65c7637dc0619de0.xml - 2025-12-04T12:27:16.6919101Z =========================== short test summary info ============================ 2025-12-04T12:27:16.6919448Z FAILED [7.4135s] distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:27:16.6919775Z Traceback (most recent call last): 2025-12-04T12:27:16.6920023Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6920267Z getattr(self, test_name)() 2025-12-04T12:27:16.6920502Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6920739Z fn() 2025-12-04T12:27:16.6920943Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6921173Z method(*args, **kwargs) 2025-12-04T12:27:16.6921396Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6921625Z method(*args, **kwargs) 2025-12-04T12:27:16.6921843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6922070Z with policy(): 2025-12-04T12:27:16.6922320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6922553Z raise RuntimeError(msg) 2025-12-04T12:27:16.6922954Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 
2025-12-04T12:27:16.6923315Z 2025-12-04T12:27:16.6923392Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6923729Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6923988Z 2025-12-04T12:27:16.6924078Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6924267Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:27:16.6924429Z ============================== 1 failed in 7.42s =============================== 2025-12-04T12:27:16.6924563Z Got exit code 1 2025-12-04T12:27:16.6924678Z Retrying single test... 2025-12-04T12:27:16.6924971Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-95628f74af187d69.xml 2025-12-04T12:27:16.6925304Z ============================= test session starts ============================== 2025-12-04T12:27:16.6925516Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:27:16.6925707Z cachedir: .pytest_cache 2025-12-04T12:27:16.6925932Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:27:16.6926170Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:27:16.6926290Z configfile: pytest.ini 2025-12-04T12:27:16.6926522Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:27:16.6926767Z collecting ... collected 1 item 2025-12-04T12:27:16.6927061Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda 2025-12-04T12:27:16.6927360Z Running 1 items in this shard 2025-12-04T12:27:16.6927434Z 2025-12-04T12:27:16.6927740Z distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda I1204 12:26:55.644000 449870 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 449939 2025-12-04T12:27:16.6928272Z I1204 12:26:55.645000 449870 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 449940 2025-12-04T12:27:16.6928615Z I1204 12:26:55.645000 449870 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 449941 2025-12-04T12:27:16.6928956Z I1204 12:26:55.646000 449870 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 449942 2025-12-04T12:27:16.6929649Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6930236Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6930860Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6931443Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6932024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6932605Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6933184Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6933761Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6934044Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6934386Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6934891Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6935376Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6935857Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6936306Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6936750Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6937217Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6937681Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6938188Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6938655Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6939109Z [rank0]:E1204 12:27:01.891000 449939 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6939565Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6940031Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6940708Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2464153600 and is now 3456106496. 2025-12-04T12:27:16.6941314Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6941665Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6942257Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6942761Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6943134Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6943572Z [rank0]:E1204 12:27:01.891000 449939 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:27:16.6943832Z dist init r=0, world=4 2025-12-04T12:27:16.6944037Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6944376Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6944867Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6945345Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6945823Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6946271Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6946712Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6947176Z [rank1]:E1204 12:27:01.901000 449940 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6947644Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6948108Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6948617Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6949068Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6949530Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6950031Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6950682Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 2025-12-04T12:27:16.6951284Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6951635Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6952226Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6952743Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6953122Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6953537Z [rank1]:E1204 12:27:01.901000 449940 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:27:16.6953778Z dist init r=1, world=4 2025-12-04T12:27:16.6953983Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6954320Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6954809Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6955290Z [rank2]:E1204 12:27:01.944000 449941 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6955772Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6956218Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6956660Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6957125Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6957590Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6958058Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6958596Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6959046Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6959543Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6960012Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6960656Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944. 
2025-12-04T12:27:16.6961258Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6961610Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6962209Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6962725Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6963092Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6963507Z [rank2]:E1204 12:27:01.944000 449941 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:27:16.6963849Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6964185Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6964672Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6965149Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6965624Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6966072Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6966511Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6966976Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6967436Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6967897Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.6968429Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6968884Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.6969337Z [rank3]:E1204 12:27:01.944000 449942 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6969800Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.6970442Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 2025-12-04T12:27:16.6971042Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6971404Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6972004Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6972505Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.6972870Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6973282Z [rank3]:E1204 12:27:01.944000 449942 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:27:16.6973524Z dist init r=2, world=4 2025-12-04T12:27:16.6973625Z dist init r=3, world=4 2025-12-04T12:27:16.6973724Z FAILED [7.3131s] [100%] 2025-12-04T12:27:16.6973791Z 2025-12-04T12:27:16.6973851Z =================================== FAILURES =================================== 2025-12-04T12:27:16.6974042Z _____________ TestMultipleWrappingCUDA.test_multiple_wrapping_cuda _____________ 2025-12-04T12:27:16.6974220Z Traceback (most recent call last): 2025-12-04T12:27:16.6974463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:27:16.6974705Z self._join_processes(fn) 2025-12-04T12:27:16.6974952Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:27:16.6975216Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:27:16.6975485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:27:16.6975747Z raise RuntimeError(error) 2025-12-04T12:27:16.6975901Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:27:16.6976063Z Traceback (most recent call last): 2025-12-04T12:27:16.6976303Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6976546Z getattr(self, test_name)() 2025-12-04T12:27:16.6976780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:27:16.6977011Z fn() 2025-12-04T12:27:16.6977212Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6977470Z method(*args, **kwargs) 2025-12-04T12:27:16.6977691Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6977921Z method(*args, **kwargs) 2025-12-04T12:27:16.6978138Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6978419Z with policy(): 2025-12-04T12:27:16.6978631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6978860Z raise RuntimeError(msg) 2025-12-04T12:27:16.6979257Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2464153600 and is now 3456106496. 2025-12-04T12:27:16.6979619Z 2025-12-04T12:27:16.6979697Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6980033Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6980328Z 2025-12-04T12:27:16.6980416Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6980542Z 2025-12-04T12:27:16.6980544Z 2025-12-04T12:27:16.6980621Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:27:16.6980823Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:27:16.6981219Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-95628f74af187d69.xml - 2025-12-04T12:27:16.6981587Z =========================== short test summary info ============================ 2025-12-04T12:27:16.6981939Z FAILED [7.3131s] distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:27:16.6982263Z Traceback (most recent call last): 2025-12-04T12:27:16.6982507Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6982751Z getattr(self, test_name)() 2025-12-04T12:27:16.6982980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6983213Z fn() 2025-12-04T12:27:16.6983412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6983639Z method(*args, **kwargs) 2025-12-04T12:27:16.6983857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6984087Z method(*args, **kwargs) 2025-12-04T12:27:16.6984304Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.6984529Z with policy(): 2025-12-04T12:27:16.6984740Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.6984972Z raise RuntimeError(msg) 2025-12-04T12:27:16.6985369Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2464153600 and is now 3456106496. 2025-12-04T12:27:16.6985731Z 2025-12-04T12:27:16.6985808Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.6986180Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.6986440Z 2025-12-04T12:27:16.6986532Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.6986720Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:27:16.6986877Z ============================== 1 failed in 7.32s =============================== 2025-12-04T12:27:16.6987009Z Got exit code 1 2025-12-04T12:27:16.6987108Z Retrying single test... 
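[editor's note] Both attempts above fail the same way: the mem_leak_check harness observes the caching allocator's allocated bytes grow (512 -> 1024) and the driver-allocated memory grow across the test, then each rank exits with code 10, which the parent process reports as the test failure. The sketch below is a rough, hypothetical approximation of that before/after comparison using public torch.cuda APIs; it is not the implementation in torch.testing._internal.

```python
# Hedged sketch of a before/after CUDA memory check of the kind reported above,
# not PyTorch's actual PYTORCH_TEST_CUDA_MEM_LEAK_CHECK logic.
import torch

def check_for_leak(fn, device: int = 0) -> None:
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before                   # bytes handed out by the driver

    fn()                                                   # the test body under scrutiny

    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: allocator {alloc_before} -> {alloc_after}, "
            f"driver {driver_before} -> {driver_after}"
        )
```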
2025-12-04T12:27:16.6987402Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-ce0028572e67d7a8.xml 2025-12-04T12:27:16.6987811Z ============================= test session starts ============================== 2025-12-04T12:27:16.6988021Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:27:16.6988247Z cachedir: .pytest_cache 2025-12-04T12:27:16.6988474Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:27:16.6988734Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:27:16.6988853Z configfile: pytest.ini 2025-12-04T12:27:16.6989094Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:27:16.6989335Z collecting ... collected 1 item 2025-12-04T12:27:16.6989628Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda 2025-12-04T12:27:16.6989923Z Running 1 items in this shard 2025-12-04T12:27:16.6989995Z 2025-12-04T12:27:16.6990301Z distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda I1204 12:27:05.605000 450264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 450333 2025-12-04T12:27:16.6990801Z I1204 12:27:05.606000 450264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 450334 2025-12-04T12:27:16.6991147Z I1204 12:27:05.606000 450264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 450335 2025-12-04T12:27:16.6991488Z I1204 12:27:05.607000 450264 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 450336 2025-12-04T12:27:16.6992171Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6992753Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6993336Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6993917Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6994497Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:27:16.6995073Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6995684Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:27:16.6996263Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:27:16.6996501Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.6996842Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.6997330Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.6997811Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.6998354Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.6998833Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.6999272Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.6999737Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7000205Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7000669Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7001134Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7001587Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.7002041Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7002506Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.7003191Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 2025-12-04T12:27:16.7003835Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7004188Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7004803Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7005307Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7005671Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7006083Z [rank1]:E1204 12:27:11.990000 450334 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:27:16.7006324Z dist init r=1, world=4 2025-12-04T12:27:16.7006526Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.7006864Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.7007351Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.7007856Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.7008360Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.7008805Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.7009248Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7009711Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7010178Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7010639Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7011101Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7011554Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.7012010Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7012479Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.7013124Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 3. CUDA driver allocated memory was 2250244096 and is now 3246391296. 2025-12-04T12:27:16.7013730Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7014108Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7014696Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7015200Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7015563Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7015976Z [rank3]:E1204 12:27:12.046000 450336 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:27:16.7016218Z dist init r=3, world=4 2025-12-04T12:27:16.7016422Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.7016774Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.7017272Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.7017750Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.7018259Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.7018710Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.7019149Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7019611Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7020074Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7020535Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7021004Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7021455Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:27:16.7021908Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7022372Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.7023039Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 2. CUDA driver allocated memory was 2300575744 and is now 3296722944. 
2025-12-04T12:27:16.7023643Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7023993Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7024576Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7025075Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7025442Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7025866Z [rank2]:E1204 12:27:12.047000 450335 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:27:16.7026127Z dist init r=2, world=4 2025-12-04T12:27:16.7026328Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:27:16.7026662Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:27:16.7027147Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.7027628Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:27:16.7028109Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.7028592Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:27:16.7029030Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7029494Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7029958Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7030419Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:27:16.7030879Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7031328Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:27:16.7031779Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7032272Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:27:16.7032912Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 0. CUDA driver allocated memory was 2459959296 and is now 3456106496. 2025-12-04T12:27:16.7033517Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7033864Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7034449Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7034949Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:27:16.7035331Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7035756Z [rank0]:E1204 12:27:12.056000 450333 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:27:16.7035996Z dist init r=0, world=4 2025-12-04T12:27:16.7036096Z FAILED [7.5129s] [100%] 2025-12-04T12:27:16.7036160Z 2025-12-04T12:27:16.7036221Z =================================== FAILURES =================================== 2025-12-04T12:27:16.7036413Z _____________ TestMultipleWrappingCUDA.test_multiple_wrapping_cuda _____________ 2025-12-04T12:27:16.7036594Z Traceback (most recent call last): 2025-12-04T12:27:16.7036839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:27:16.7037083Z self._join_processes(fn) 2025-12-04T12:27:16.7037328Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:27:16.7037593Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:27:16.7037859Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:27:16.7038117Z raise RuntimeError(error) 2025-12-04T12:27:16.7038301Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:27:16.7038462Z Traceback (most recent call last): 2025-12-04T12:27:16.7038701Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.7038942Z getattr(self, test_name)() 2025-12-04T12:27:16.7039174Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, 
in wrapper 2025-12-04T12:27:16.7039405Z fn() 2025-12-04T12:27:16.7039607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7039838Z method(*args, **kwargs) 2025-12-04T12:27:16.7040058Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7040288Z method(*args, **kwargs) 2025-12-04T12:27:16.7040506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7040732Z with policy(): 2025-12-04T12:27:16.7040945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7041175Z raise RuntimeError(msg) 2025-12-04T12:27:16.7041606Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 2025-12-04T12:27:16.7041972Z 2025-12-04T12:27:16.7042047Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7042381Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7042641Z 2025-12-04T12:27:16.7042730Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7042855Z 2025-12-04T12:27:16.7042857Z 2025-12-04T12:27:16.7042934Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:27:16.7043133Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:27:16.7043530Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-ce0028572e67d7a8.xml - 2025-12-04T12:27:16.7043910Z =========================== short test summary info ============================ 2025-12-04T12:27:16.7044269Z FAILED [7.5129s] distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:27:16.7044594Z Traceback (most recent call last): 2025-12-04T12:27:16.7044839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:27:16.7045081Z getattr(self, test_name)() 2025-12-04T12:27:16.7045312Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:27:16.7045544Z fn() 2025-12-04T12:27:16.7045744Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7045975Z method(*args, **kwargs) 2025-12-04T12:27:16.7046193Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:27:16.7046422Z method(*args, **kwargs) 2025-12-04T12:27:16.7046639Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:27:16.7046862Z with policy(): 2025-12-04T12:27:16.7047073Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:27:16.7047302Z raise RuntimeError(msg) 2025-12-04T12:27:16.7047702Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestMultipleWrappingCUDA.test_multiple_wrapping_cuda! Caching allocator allocated memory was 512 and is now reported as 1024 on device 1. CUDA driver allocated memory was 2317352960 and is now 3313500160. 2025-12-04T12:27:16.7048064Z 2025-12-04T12:27:16.7048141Z To execute this test, run the following from the base repo dir: 2025-12-04T12:27:16.7048517Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_multiple_wrapping.py TestMultipleWrappingCUDA.test_multiple_wrapping_cuda 2025-12-04T12:27:16.7048776Z 2025-12-04T12:27:16.7048865Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:27:16.7049053Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
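The repro command printed in the failure summary above can also be driven from Python. A small convenience sketch, assuming it is run from the base repo dir; the subprocess/os usage is illustrative only and equivalent to the printed shell command:

    import os
    import subprocess

    env = dict(os.environ,
               PYTORCH_TEST_WITH_ROCM="1",
               PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")
    subprocess.run(
        ["python", "test/distributed/fsdp/test_fsdp_multiple_wrapping.py",
         "TestMultipleWrappingCUDA.test_multiple_wrapping_cuda"],
        env=env,
        check=True,
    )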
2025-12-04T12:27:16.7049211Z ============================== 1 failed in 7.52s =============================== 2025-12-04T12:27:16.7049347Z Got exit code 1 2025-12-04T12:27:16.7049581Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda 2025-12-04T12:27:16.7049951Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:27:16.7050341Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-e6f42f0989388e92.xml 2025-12-04T12:27:16.7050660Z ============================= test session starts ============================== 2025-12-04T12:27:16.7050868Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:27:16.7051054Z cachedir: .pytest_cache 2025-12-04T12:27:16.7051278Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:27:16.7051515Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:27:16.7051633Z configfile: pytest.ini 2025-12-04T12:27:16.7051863Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:27:16.7052129Z collecting ... collected 1 item / 1 deselected / 0 selected 2025-12-04T12:27:16.7052305Z stepcurrent: skipping 1 already run items. 2025-12-04T12:27:16.7052436Z Running 0 items in this shard 2025-12-04T12:27:16.7052508Z 2025-12-04T12:27:16.7052797Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_multiple_wrapping/distributed.fsdp.test_fsdp_multiple_wrapping-e6f42f0989388e92.xml - 2025-12-04T12:27:16.7053160Z ============================ 1 deselected in 0.00s ============================= 2025-12-04T12:27:16.7053461Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_multiple_wrapping.py::TestMultipleWrappingCUDA::test_multiple_wrapping_cuda'] 2025-12-04T12:27:16.7053699Z 2025-12-04T12:27:16.7053921Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_multiple_wrapping 1/1 (test/test-reports/distributed.fsdp.test_fsdp_multiple_wrapping_1.1_7d9d262da9a8dffa_.log) 2025-12-04T12:27:16.7054178Z 2025-12-04T12:27:16.7054326Z Finished distributed/fsdp/test_fsdp_multiple_wrapping 1/1 ... [2025-12-04 12:27:16.685328][2290735.334506378], took 0.55min 2025-12-04T12:27:16.7054764Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:27:16.7055156Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:27:16.7055373Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:27:16.7055552Z Uploading artifacts took 0.00 seconds 2025-12-04T12:27:16.7055707Z distributed/fsdp/test_fsdp_multiple_wrapping 1/1 failed! 2025-12-04T12:27:16.7055933Z Running distributed/fsdp/test_fsdp_ignored_modules 1/1 ... 
[2025-12-04 12:27:16.689038][2290735.338222305] 2025-12-04T12:27:16.7056140Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:27:16.7056552Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_ignored_modules.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:27:16.689216] 2025-12-04T12:28:09.7319075Z 2025-12-04T12:28:09.7319953Z distributed/fsdp/test_fsdp_ignored_modules 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_ignored_modules_1.1_7975e69e7f9e6ae8_.log 2025-12-04T12:28:09.7323529Z Running 8 items in this shard: test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_diff_ignored_modules_across_ranks, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_invalid, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_nested, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_not_under_wrapped_root_ignore_modules_False, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_not_under_wrapped_root_ignore_modules_True, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_modules_transformer, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_states_auto_wrap, test/distributed/fsdp/test_fsdp_ignored_modules.py::TestFSDPIgnoredModules::test_ignored_states_check 2025-12-04T12:28:09.7325643Z 2025-12-04T12:28:09.7325850Z Finished distributed/fsdp/test_fsdp_ignored_modules 1/1 ... [2025-12-04 12:28:09.731602][2290788.380781929], took 0.88min 2025-12-04T12:28:09.7333913Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:28:09.7349231Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:28:09.7352421Z Running distributed/fsdp/test_checkpoint_wrapper 1/1 ... [2025-12-04 12:28:09.735163][2290788.384347319] 2025-12-04T12:28:09.7352766Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:28:09.7354333Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_checkpoint_wrapper.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:28:09.735344] 2025-12-04T12:28:13.1548096Z 2025-12-04T12:28:13.1553063Z distributed/fsdp/test_checkpoint_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_checkpoint_wrapper_1.1_d80cb57983854b35_.log 2025-12-04T12:28:13.1556035Z Running 8 items in this shard: test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_apply_activation_checkpointing, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_args_kwargs, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_cpu_offload, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_kwarg_support, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_checkpoint_wrapper_parity, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_forward_missing_attributes, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_fqn, test/distributed/fsdp/test_checkpoint_wrapper.py::CheckpointWrapperTest::test_load_activation_checkpointed_module 2025-12-04T12:28:13.1558118Z 2025-12-04T12:28:13.1558400Z Finished distributed/fsdp/test_checkpoint_wrapper 1/1 ... [2025-12-04 12:28:13.154469][2290791.803650169], took 0.06min 2025-12-04T12:28:13.1563452Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:28:13.1578765Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:28:13.1581758Z Running distributed/fsdp/test_fsdp_checkpoint 1/1 ... [2025-12-04 12:28:13.158043][2290791.807226429] 2025-12-04T12:28:13.1582021Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:28:13.1583367Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_checkpoint.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:28:13.158211] 2025-12-04T12:31:08.7923162Z 2025-12-04T12:31:08.7923922Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_checkpoint 1/1 (test/test-reports/distributed.fsdp.test_fsdp_checkpoint_1.1_18dc4e01a7029ded_.log) 2025-12-04T12:31:08.7924705Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-23ed22c1e35acd9c.xml 2025-12-04T12:31:08.7926051Z ============================= test session starts ============================== 2025-12-04T12:31:08.7926421Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:31:08.7926732Z cachedir: .pytest_cache 2025-12-04T12:31:08.7927105Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:31:08.7927507Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:31:08.7927694Z configfile: pytest.ini 2025-12-04T12:31:08.7928108Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:31:08.7929275Z collecting ... 
/var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:31:08.7930007Z class TestModel(nn.Module): 2025-12-04T12:31:08.7930184Z collected 17 items 2025-12-04T12:31:08.7930373Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:31:08.7937267Z Running 17 items in this shard: test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_False, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_True, test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.7943964Z 2025-12-04T12:31:08.7944654Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_False I1204 12:28:14.870000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 452374 2025-12-04T12:31:08.7945598Z I1204 12:28:14.870000 
452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 452375 2025-12-04T12:31:08.7946172Z I1204 12:28:14.871000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 452376 2025-12-04T12:31:08.7946739Z I1204 12:28:14.872000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 452377 2025-12-04T12:31:08.7947550Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7948200Z return func(*args, **kwargs) 2025-12-04T12:31:08.7948377Z dist init r=0, world=4 2025-12-04T12:31:08.7948535Z dist init r=3, world=4 2025-12-04T12:31:08.7948689Z dist init r=2, world=4 2025-12-04T12:31:08.7948874Z dist init r=1, world=4 2025-12-04T12:31:08.7949028Z PASSED [8.4128s] [ 5%] 2025-12-04T12:31:08.7949747Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_False_use_orig_params_True I1204 12:28:23.286000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 452707 2025-12-04T12:31:08.7950734Z I1204 12:28:23.287000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 452708 2025-12-04T12:31:08.7951306Z I1204 12:28:23.287000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 452709 2025-12-04T12:31:08.7951875Z I1204 12:28:23.288000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 452710 2025-12-04T12:31:08.7952682Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7953302Z return func(*args, **kwargs) 2025-12-04T12:31:08.7953475Z dist init r=0, world=4 2025-12-04T12:31:08.7953626Z dist init r=3, world=4 2025-12-04T12:31:08.7953776Z dist init r=1, world=4 2025-12-04T12:31:08.7953926Z dist init r=2, world=4 2025-12-04T12:31:08.7954080Z PASSED [8.2110s] [ 11%] 2025-12-04T12:31:08.7954798Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_False I1204 12:28:31.499000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 453040 2025-12-04T12:31:08.7955736Z I1204 12:28:31.499000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 453041 2025-12-04T12:31:08.7956308Z I1204 12:28:31.500000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 453042 2025-12-04T12:31:08.7956875Z I1204 12:28:31.500000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 453043 2025-12-04T12:31:08.7957672Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
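The barrier() UserWarning above suggests binding the process group to an explicit device. A minimal sketch of that initialization, assuming env:// rendezvous (MASTER_ADDR/MASTER_PORT already set) and RANK/WORLD_SIZE environment variables; this is illustrative and not the test harness's own init path:

    import os
    import torch
    import torch.distributed as dist

    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    torch.cuda.set_device(rank)
    dist.init_process_group(
        backend="nccl",                        # RCCL on ROCm still uses the "nccl" backend name
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),  # silences the barrier() device warning
    )
    dist.barrier()
    dist.destroy_process_group()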
2025-12-04T12:31:08.7958310Z return func(*args, **kwargs) 2025-12-04T12:31:08.7958482Z dist init r=0, world=4 2025-12-04T12:31:08.7958635Z dist init r=3, world=4 2025-12-04T12:31:08.7958786Z dist init r=1, world=4 2025-12-04T12:31:08.7958935Z dist init r=2, world=4 2025-12-04T12:31:08.7959086Z PASSED [8.3114s] [ 17%] 2025-12-04T12:31:08.7959852Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload0_offload_activations_True_use_orig_params_True I1204 12:28:39.812000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 453373 2025-12-04T12:31:08.7960778Z I1204 12:28:39.812000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 453374 2025-12-04T12:31:08.7961347Z I1204 12:28:39.813000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 453375 2025-12-04T12:31:08.7961913Z I1204 12:28:39.813000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 453376 2025-12-04T12:31:08.7962712Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7963320Z return func(*args, **kwargs) 2025-12-04T12:31:08.7963491Z dist init r=0, world=4 2025-12-04T12:31:08.7963646Z dist init r=3, world=4 2025-12-04T12:31:08.7963797Z dist init r=1, world=4 2025-12-04T12:31:08.7963976Z dist init r=2, world=4 2025-12-04T12:31:08.7964127Z PASSED [8.2111s] [ 23%] 2025-12-04T12:31:08.7964843Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_False I1204 12:28:48.024000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 453706 2025-12-04T12:31:08.7965793Z I1204 12:28:48.025000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 453707 2025-12-04T12:31:08.7966358Z I1204 12:28:48.025000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 453708 2025-12-04T12:31:08.7966923Z I1204 12:28:48.026000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 453709 2025-12-04T12:31:08.7967722Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.7968380Z return func(*args, **kwargs) 2025-12-04T12:31:08.7968551Z dist init r=0, world=4 2025-12-04T12:31:08.7968705Z dist init r=3, world=4 2025-12-04T12:31:08.7968857Z dist init r=1, world=4 2025-12-04T12:31:08.7969008Z dist init r=2, world=4 2025-12-04T12:31:08.7969159Z PASSED [8.6112s] [ 29%] 2025-12-04T12:31:08.7969876Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_False_use_orig_params_True I1204 12:28:56.637000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 454039 2025-12-04T12:31:08.7970811Z I1204 12:28:56.637000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 454040 2025-12-04T12:31:08.7971381Z I1204 12:28:56.638000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 454041 2025-12-04T12:31:08.7971951Z I1204 12:28:56.639000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 454042 2025-12-04T12:31:08.7972753Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7973359Z return func(*args, **kwargs) 2025-12-04T12:31:08.7973530Z dist init r=0, world=4 2025-12-04T12:31:08.7973681Z dist init r=3, world=4 2025-12-04T12:31:08.7973832Z dist init r=1, world=4 2025-12-04T12:31:08.7973982Z dist init r=2, world=4 2025-12-04T12:31:08.7974133Z PASSED [8.2112s] [ 35%] 2025-12-04T12:31:08.7974898Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_False I1204 12:29:04.849000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 454372 2025-12-04T12:31:08.7975828Z I1204 12:29:04.850000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 454373 2025-12-04T12:31:08.7976399Z I1204 12:29:04.851000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 454374 2025-12-04T12:31:08.7976965Z I1204 12:29:04.851000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 454375 2025-12-04T12:31:08.7977762Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.7978419Z return func(*args, **kwargs) 2025-12-04T12:31:08.7978590Z dist init r=0, world=4 2025-12-04T12:31:08.7978745Z dist init r=3, world=4 2025-12-04T12:31:08.7978897Z dist init r=1, world=4 2025-12-04T12:31:08.7979050Z dist init r=2, world=4 2025-12-04T12:31:08.7979221Z PASSED [8.2109s] [ 41%] 2025-12-04T12:31:08.7979933Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_basic_checkpoint_end_to_end_cpu_offload1_offload_activations_True_use_orig_params_True I1204 12:29:13.062000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 454705 2025-12-04T12:31:08.7980884Z I1204 12:29:13.063000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 454706 2025-12-04T12:31:08.7981449Z I1204 12:29:13.063000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 454707 2025-12-04T12:31:08.7982012Z I1204 12:29:13.064000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 454708 2025-12-04T12:31:08.7982811Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7983423Z return func(*args, **kwargs) 2025-12-04T12:31:08.7983594Z dist init r=0, world=4 2025-12-04T12:31:08.7983747Z dist init r=3, world=4 2025-12-04T12:31:08.7983897Z dist init r=2, world=4 2025-12-04T12:31:08.7984048Z dist init r=1, world=4 2025-12-04T12:31:08.7984200Z PASSED [8.3114s] [ 47%] 2025-12-04T12:31:08.7984914Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_False I1204 12:29:21.375000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 455038 2025-12-04T12:31:08.7985847Z I1204 12:29:21.375000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 455039 2025-12-04T12:31:08.7986415Z I1204 12:29:21.376000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 455040 2025-12-04T12:31:08.7986982Z I1204 12:29:21.376000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 455041 2025-12-04T12:31:08.7987785Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.7988452Z return func(*args, **kwargs) 2025-12-04T12:31:08.7988623Z dist init r=0, world=4 2025-12-04T12:31:08.7988775Z dist init r=3, world=4 2025-12-04T12:31:08.7988924Z dist init r=2, world=4 2025-12-04T12:31:08.7989075Z dist init r=1, world=4 2025-12-04T12:31:08.7989228Z PASSED [8.4115s] [ 52%] 2025-12-04T12:31:08.7989988Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_False_use_orig_params_True I1204 12:29:29.787000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 455371 2025-12-04T12:31:08.7990922Z I1204 12:29:29.788000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 455372 2025-12-04T12:31:08.7991491Z I1204 12:29:29.789000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 455373 2025-12-04T12:31:08.7992055Z I1204 12:29:29.789000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 455374 2025-12-04T12:31:08.7992854Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.7993467Z return func(*args, **kwargs) 2025-12-04T12:31:08.7993636Z dist init r=0, world=4 2025-12-04T12:31:08.7993789Z dist init r=3, world=4 2025-12-04T12:31:08.7993944Z dist init r=2, world=4 2025-12-04T12:31:08.7994096Z dist init r=1, world=4 2025-12-04T12:31:08.7994291Z PASSED [8.6121s] [ 58%] 2025-12-04T12:31:08.7995003Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_False I1204 12:29:38.401000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 455704 2025-12-04T12:31:08.7995950Z I1204 12:29:38.402000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 455705 2025-12-04T12:31:08.7996513Z I1204 12:29:38.402000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 455706 2025-12-04T12:31:08.7997073Z I1204 12:29:38.403000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 455707 2025-12-04T12:31:08.7997865Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.7998518Z return func(*args, **kwargs) 2025-12-04T12:31:08.7998687Z dist init r=0, world=4 2025-12-04T12:31:08.7998836Z dist init r=3, world=4 2025-12-04T12:31:08.7998984Z dist init r=1, world=4 2025-12-04T12:31:08.7999131Z dist init r=2, world=4 2025-12-04T12:31:08.7999279Z PASSED [8.2116s] [ 64%] 2025-12-04T12:31:08.7999987Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload0_offload_activations_True_use_orig_params_True I1204 12:29:46.614000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 456037 2025-12-04T12:31:08.8000913Z I1204 12:29:46.615000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 456038 2025-12-04T12:31:08.8001477Z I1204 12:29:46.615000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 456039 2025-12-04T12:31:08.8002042Z I1204 12:29:46.616000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 456040 2025-12-04T12:31:08.8002837Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.8003443Z return func(*args, **kwargs) 2025-12-04T12:31:08.8003611Z dist init r=0, world=4 2025-12-04T12:31:08.8003761Z dist init r=3, world=4 2025-12-04T12:31:08.8003909Z dist init r=1, world=4 2025-12-04T12:31:08.8004057Z dist init r=2, world=4 2025-12-04T12:31:08.8004206Z PASSED [8.5117s] [ 70%] 2025-12-04T12:31:08.8004965Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_False I1204 12:29:55.127000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 456370 2025-12-04T12:31:08.8005895Z I1204 12:29:55.128000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 456371 2025-12-04T12:31:08.8006458Z I1204 12:29:55.129000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 456372 2025-12-04T12:31:08.8007022Z I1204 12:29:55.129000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 456373 2025-12-04T12:31:08.8007816Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.8008451Z return func(*args, **kwargs) 2025-12-04T12:31:08.8008619Z dist init r=3, world=4 2025-12-04T12:31:08.8008770Z dist init r=0, world=4 2025-12-04T12:31:08.8008922Z dist init r=1, world=4 2025-12-04T12:31:08.8009070Z dist init r=2, world=4 2025-12-04T12:31:08.8009219Z PASSED [8.8118s] [ 76%] 2025-12-04T12:31:08.8009960Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_False_use_orig_params_True I1204 12:30:03.941000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 456703 2025-12-04T12:31:08.8010902Z I1204 12:30:03.941000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 456704 2025-12-04T12:31:08.8011464Z I1204 12:30:03.942000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 456705 2025-12-04T12:31:08.8012025Z I1204 12:30:03.943000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 456706 2025-12-04T12:31:08.8012819Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.8013425Z return func(*args, **kwargs) 2025-12-04T12:31:08.8013591Z dist init r=0, world=4 2025-12-04T12:31:08.8013740Z dist init r=3, world=4 2025-12-04T12:31:08.8013889Z dist init r=1, world=4 2025-12-04T12:31:08.8014038Z dist init r=2, world=4 2025-12-04T12:31:08.8014187Z PASSED [8.4122s] [ 82%] 2025-12-04T12:31:08.8014895Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_False I1204 12:30:12.354000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 457036 2025-12-04T12:31:08.8015821Z I1204 12:30:12.355000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 457037 2025-12-04T12:31:08.8016388Z I1204 12:30:12.356000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 457038 2025-12-04T12:31:08.8016952Z I1204 12:30:12.356000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 457039 2025-12-04T12:31:08.8017748Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:31:08.8018388Z return func(*args, **kwargs) 2025-12-04T12:31:08.8018555Z dist init r=0, world=4 2025-12-04T12:31:08.8018701Z dist init r=3, world=4 2025-12-04T12:31:08.8018849Z dist init r=1, world=4 2025-12-04T12:31:08.8018997Z dist init r=2, world=4 2025-12-04T12:31:08.8019145Z PASSED [8.5126s] [ 88%] 2025-12-04T12:31:08.8019928Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpoint::test_checkpoint_fsdp_wrapping_cpu_offload1_offload_activations_True_use_orig_params_True I1204 12:30:20.868000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 457369 2025-12-04T12:31:08.8020862Z I1204 12:30:20.869000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 457370 2025-12-04T12:31:08.8021426Z I1204 12:30:20.870000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 457371 2025-12-04T12:31:08.8021990Z I1204 12:30:20.870000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 457372 2025-12-04T12:31:08.8022783Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:31:08.8023390Z return func(*args, **kwargs) 2025-12-04T12:31:08.8023558Z dist init r=0, world=4 2025-12-04T12:31:08.8023707Z dist init r=3, world=4 2025-12-04T12:31:08.8023856Z dist init r=2, world=4 2025-12-04T12:31:08.8024009Z dist init r=1, world=4 2025-12-04T12:31:08.8024161Z PASSED [8.3113s] [ 94%] 2025-12-04T12:31:08.8024851Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda I1204 12:30:29.182000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 457702 2025-12-04T12:31:08.8025750Z I1204 12:30:29.182000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 457703 2025-12-04T12:31:08.8026313Z I1204 12:30:29.183000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 457704 2025-12-04T12:31:08.8026873Z I1204 12:30:29.183000 452305 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 457705 2025-12-04T12:31:08.8027649Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8028333Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8029354Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8030337Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8030947Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T12:31:08.8031587Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8032599Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8033576Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8034179Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8034818Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8035500Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8036150Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8036798Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8037442Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8038086Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8038768Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8039403Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8040046Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8040711Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8041378Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8042021Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8042656Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8043671Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
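Note: the _init_utils.py UserWarning repeated above comes from passing the bare string "cuda" as device_id. Per the warning text, either call torch.cuda.set_device() before constructing FSDP or pass a device with an explicit index. A minimal sketch (the rank lookup and import alias are illustrative):

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    rank = dist.get_rank()
    torch.cuda.set_device(rank)                               # option 1: set the current device
    fsdp_kwargs = {"device_id": torch.device("cuda", rank)}   # option 2: indexed device
    # e.g. model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs)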
2025-12-04T12:31:08.8044646Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8045248Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8045881Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8046514Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8047155Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8047801Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8048480Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8049131Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8049768Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8050780Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8051797Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8052398Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8053035Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8053669Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8054312Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8054956Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8055602Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8057994Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8060497Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8062935Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8065345Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8067810Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8070252Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8072677Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8075101Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8075596Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8076160Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8076984Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8077792Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8078647Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8079398Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8080140Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8080920Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8081699Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8082480Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8083257Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8084014Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8084813Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8085593Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8086770Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 2025-12-04T12:31:08.8087876Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8088493Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8089546Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8090487Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8091092Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8091784Z [rank2]:E1204 12:30:36.208000 457704 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:31:08.8092178Z dist init r=2, world=4 2025-12-04T12:31:08.8092509Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8093067Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8093880Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8094684Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8095482Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8096230Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8096964Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8097743Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8098561Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8099336Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8100156Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8100910Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8101674Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8102455Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8103627Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:31:08.8104732Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8105327Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8106412Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8107319Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8107926Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8108653Z [rank3]:E1204 12:30:36.214000 457705 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:31:08.8109050Z dist init r=3, world=4 2025-12-04T12:31:08.8109374Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8109932Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8110747Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8111554Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8112353Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8113099Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8113833Z [rank0]:E1204 12:30:36.277000 457702 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8114606Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8115387Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8116201Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8116976Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8117732Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8118535Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8119312Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8120488Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2459959296 and is now 3948937216. 
2025-12-04T12:31:08.8121625Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8122200Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8123255Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8124160Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8124763Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8125452Z [rank0]:E1204 12:30:36.277000 457702 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:31:08.8125845Z dist init r=0, world=4 2025-12-04T12:31:08.8126167Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8126724Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8127539Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8128379Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8129181Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8129928Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8130663Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8131488Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8132264Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8133041Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8133816Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8134569Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T12:31:08.8135326Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8136120Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8137308Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 1. CUDA driver allocated memory was 2317352960 and is now 3806330880. 2025-12-04T12:31:08.8138453Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8139034Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8140076Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8140980Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8141581Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8142268Z [rank1]:E1204 12:30:36.283000 457703 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:31:08.8142665Z dist init r=1, world=4 2025-12-04T12:31:08.8143337Z [rank0]:[W1204 12:30:36.112743687 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
2025-12-04T12:31:08.8144014Z FAILED [8.8121s] [100%]
2025-12-04T12:31:08.8144112Z 
2025-12-04T12:31:08.8144200Z =================================== FAILURES ===================================
2025-12-04T12:31:08.8144541Z _ TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda _
2025-12-04T12:31:08.8144867Z Traceback (most recent call last):
2025-12-04T12:31:08.8145260Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper
2025-12-04T12:31:08.8145656Z self._join_processes(fn)
2025-12-04T12:31:08.8146054Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes
2025-12-04T12:31:08.8146527Z self._check_return_codes(fn, elapsed_time)
2025-12-04T12:31:08.8146962Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T12:31:08.8147388Z raise RuntimeError(error)
2025-12-04T12:31:08.8147624Z RuntimeError: Process 2 exited with error code 10 and exception:
2025-12-04T12:31:08.8147879Z Traceback (most recent call last):
2025-12-04T12:31:08.8148304Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T12:31:08.8148696Z getattr(self, test_name)()
2025-12-04T12:31:08.8149070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T12:31:08.8149450Z fn()
2025-12-04T12:31:08.8149774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:31:08.8150150Z method(*args, **kwargs)
2025-12-04T12:31:08.8150509Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:31:08.8150903Z method(*args, **kwargs)
2025-12-04T12:31:08.8151254Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:31:08.8151644Z with policy():
2025-12-04T12:31:08.8151985Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:31:08.8152362Z raise RuntimeError(msg)
2025-12-04T12:31:08.8153104Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664.
2025-12-04T12:31:08.8153801Z 2025-12-04T12:31:08.8153920Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8154540Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8155045Z 2025-12-04T12:31:08.8155186Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8155388Z 2025-12-04T12:31:08.8155390Z 2025-12-04T12:31:08.8155513Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:31:08.8155833Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:31:08.8156458Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-23ed22c1e35acd9c.xml - 2025-12-04T12:31:08.8157033Z =========================== short test summary info ============================ 2025-12-04T12:31:08.8157669Z FAILED [8.8121s] distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:31:08.8158308Z Traceback (most recent call last): 2025-12-04T12:31:08.8158705Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8159100Z getattr(self, test_name)() 2025-12-04T12:31:08.8159475Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8159854Z fn() 2025-12-04T12:31:08.8160177Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8160551Z method(*args, **kwargs) 2025-12-04T12:31:08.8160947Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8161321Z method(*args, **kwargs) 2025-12-04T12:31:08.8161675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8162044Z with policy(): 2025-12-04T12:31:08.8162384Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8162763Z raise RuntimeError(msg) 2025-12-04T12:31:08.8163509Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 2025-12-04T12:31:08.8164206Z 2025-12-04T12:31:08.8164321Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8164942Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8165500Z 2025-12-04T12:31:08.8165639Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8165961Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
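Note: the RuntimeError above is raised by the mem-leak-check policy, which compares per-device memory before and after the test body; the two figures in the message are the caching-allocator bytes and the driver-level allocation. The sketch below only illustrates that kind of bookkeeping (the real check lives in common_utils.py and is more involved), and it ends with the explicit shutdown the ProcessGroupNCCL warning asks for:

    import torch
    import torch.distributed as dist

    device = torch.cuda.current_device()
    alloc_before = torch.cuda.memory_allocated(device)     # caching-allocator view
    free_before, _total = torch.cuda.mem_get_info(device)  # driver-level view

    # ... run the test body here ...

    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    if alloc_after > alloc_before or free_after < free_before:
        raise RuntimeError(
            f"possible leak on device {device}: allocator {alloc_before} -> {alloc_after}"
        )

    # Addresses the separate "destroy_process_group() was not called" warning.
    if dist.is_initialized():
        dist.destroy_process_group()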
2025-12-04T12:31:08.8166223Z =================== 1 failed, 16 passed in 143.13s (0:02:23) =================== 2025-12-04T12:31:08.8166444Z Got exit code 1 2025-12-04T12:31:08.8166590Z Retrying single test... 2025-12-04T12:31:08.8167034Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-b6017cfe350bdc50.xml 2025-12-04T12:31:08.8167531Z ============================= test session starts ============================== 2025-12-04T12:31:08.8167866Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:31:08.8168211Z cachedir: .pytest_cache 2025-12-04T12:31:08.8168570Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:31:08.8168959Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:31:08.8169143Z configfile: pytest.ini 2025-12-04T12:31:08.8169506Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:31:08.8170393Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:31:08.8171079Z class TestModel(nn.Module): 2025-12-04T12:31:08.8171273Z collected 17 items / 16 deselected / 1 selected 2025-12-04T12:31:08.8171850Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8172404Z Running 1 items in this shard 2025-12-04T12:31:08.8172516Z 2025-12-04T12:31:08.8173094Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda I1204 12:30:40.453000 458035 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 458104 2025-12-04T12:31:08.8173990Z I1204 12:30:40.454000 458035 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 458105 2025-12-04T12:31:08.8174558Z I1204 12:30:40.454000 458035 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 458106 2025-12-04T12:31:08.8175123Z I1204 12:30:40.455000 458035 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 458107 2025-12-04T12:31:08.8175943Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8176586Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8177608Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:31:08.8178632Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8179238Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8179881Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8180535Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8181214Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8181862Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8182504Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8183150Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8183790Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8184807Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8185784Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8186387Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8187025Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8187662Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8188333Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8188966Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8189611Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8190634Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:31:08.8191644Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8192245Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8192891Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8193535Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8194170Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8194806Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8195447Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8196092Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8196755Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8197422Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8198057Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8199109Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8200084Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8200688Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8201328Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8201967Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8202614Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8203261Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8203906Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8206324Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8208768Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8211200Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8213637Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8216062Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8218509Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8220942Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8223351Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8223845Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8224407Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8225257Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8226061Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8226862Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8227609Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8228396Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8229177Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8229977Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8230766Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8231539Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8232293Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8233053Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T12:31:08.8233834Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8235014Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2459959296 and is now 3948937216. 2025-12-04T12:31:08.8236120Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8236701Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8237760Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8238715Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8239319Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8240008Z [rank0]:E1204 12:30:52.404000 458104 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:31:08.8240404Z dist init r=0, world=4 2025-12-04T12:31:08.8240772Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8241332Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8242148Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8242947Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8243747Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8244498Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8245256Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8246049Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8246824Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8247598Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8248404Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8249163Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8249929Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8250711Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8251888Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:31:08.8252999Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8253578Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8254628Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8255532Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8256177Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8256865Z [rank3]:E1204 12:30:52.416000 458107 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:31:08.8257262Z dist init r=3, world=4 2025-12-04T12:31:08.8257586Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8258180Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8258993Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8259796Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8260595Z [rank1]:E1204 12:30:52.423000 458105 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8261380Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8262113Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8262891Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8263671Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8264448Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8265222Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8265978Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8266735Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8267516Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8268717Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 1. CUDA driver allocated memory was 2317352960 and is now 3806330880. 
2025-12-04T12:31:08.8269818Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8270394Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8271484Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8272387Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8272992Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8273684Z [rank1]:E1204 12:30:52.423000 458105 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:31:08.8274079Z dist init r=1, world=4 2025-12-04T12:31:08.8274400Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8274958Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8275773Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8276610Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8277411Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8278194Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8278928Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8279703Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8280476Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8281251Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8282024Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8282778Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T12:31:08.8283537Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8284317Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8285490Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 2025-12-04T12:31:08.8286586Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8287198Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8288277Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8289177Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8289780Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8290468Z [rank2]:E1204 12:30:52.429000 458106 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:31:08.8290865Z dist init r=2, world=4 2025-12-04T12:31:08.8291523Z [rank0]:[W1204 12:30:52.247419805 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:31:08.8292241Z FAILED [13.7172s] [100%] 2025-12-04T12:31:08.8292345Z 2025-12-04T12:31:08.8292429Z =================================== FAILURES =================================== 2025-12-04T12:31:08.8292768Z _ TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda _ 2025-12-04T12:31:08.8293094Z Traceback (most recent call last): 2025-12-04T12:31:08.8293485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:31:08.8293880Z self._join_processes(fn) 2025-12-04T12:31:08.8294278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:31:08.8294707Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:31:08.8295147Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:31:08.8295574Z raise RuntimeError(error) 2025-12-04T12:31:08.8295810Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:31:08.8296065Z Traceback (most recent call last): 2025-12-04T12:31:08.8296454Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8296846Z getattr(self, test_name)() 2025-12-04T12:31:08.8297222Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8297600Z fn() 2025-12-04T12:31:08.8297925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8298337Z method(*args, **kwargs) 2025-12-04T12:31:08.8298698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8299072Z method(*args, **kwargs) 2025-12-04T12:31:08.8299424Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8299793Z with policy(): 2025-12-04T12:31:08.8300133Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8300515Z raise RuntimeError(msg) 2025-12-04T12:31:08.8301314Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2459959296 and is now 3948937216. 
2025-12-04T12:31:08.8302011Z 2025-12-04T12:31:08.8302127Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8302745Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8303252Z 2025-12-04T12:31:08.8303391Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8303595Z 2025-12-04T12:31:08.8303597Z 2025-12-04T12:31:08.8303718Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:31:08.8304036Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:31:08.8304657Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-b6017cfe350bdc50.xml - 2025-12-04T12:31:08.8305236Z =========================== short test summary info ============================ 2025-12-04T12:31:08.8305883Z FAILED [13.7172s] distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:31:08.8306501Z Traceback (most recent call last): 2025-12-04T12:31:08.8306892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8307286Z getattr(self, test_name)() 2025-12-04T12:31:08.8307659Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8308036Z fn() 2025-12-04T12:31:08.8308399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8308774Z method(*args, **kwargs) 2025-12-04T12:31:08.8309128Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8309503Z method(*args, **kwargs) 2025-12-04T12:31:08.8309857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8310227Z with policy(): 2025-12-04T12:31:08.8310566Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8310943Z raise RuntimeError(msg) 2025-12-04T12:31:08.8311688Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2459959296 and is now 3948937216. 2025-12-04T12:31:08.8312384Z 2025-12-04T12:31:08.8312501Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8313122Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8313628Z 2025-12-04T12:31:08.8313765Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8314062Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
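The leak check behind these failures (PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1) snapshots allocator memory before the test body and compares it afterwards, which is what the "Caching allocator allocated memory was 512 and is now reported as 3236352" messages reflect. Below is only a rough Python sketch of that kind of before/after comparison; the helper name check_cuda_leak is hypothetical and this is not the actual harness in torch/testing/_internal/common_utils.py.

# Rough sketch of a before/after caching-allocator comparison, in the spirit
# of PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 (hypothetical helper, not the real
# harness in torch/testing/_internal/common_utils.py).
import torch

def check_cuda_leak(test_fn, device: int = 0) -> None:
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    before = torch.cuda.memory_allocated(device)  # caching-allocator bytes in use
    test_fn()
    torch.cuda.synchronize(device)
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible leak on device {device}: {before} -> {after} bytes"
        )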
2025-12-04T12:31:08.8314321Z ====================== 1 failed, 16 deselected in 13.73s ======================= 2025-12-04T12:31:08.8314537Z Got exit code 1 2025-12-04T12:31:08.8314684Z Retrying single test... 2025-12-04T12:31:08.8315129Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-ba679004c1dc5cc7.xml 2025-12-04T12:31:08.8315664Z ============================= test session starts ============================== 2025-12-04T12:31:08.8315994Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:31:08.8316295Z cachedir: .pytest_cache 2025-12-04T12:31:08.8316651Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:31:08.8317040Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:31:08.8317221Z configfile: pytest.ini 2025-12-04T12:31:08.8317583Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:31:08.8318508Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:31:08.8319195Z class TestModel(nn.Module): 2025-12-04T12:31:08.8319392Z collected 17 items / 16 deselected / 1 selected 2025-12-04T12:31:08.8319962Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8320542Z Running 1 items in this shard 2025-12-04T12:31:08.8320676Z 2025-12-04T12:31:08.8321250Z distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda I1204 12:30:56.748000 458437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 458506 2025-12-04T12:31:08.8322134Z I1204 12:30:56.749000 458437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 458507 2025-12-04T12:31:08.8322701Z I1204 12:30:56.749000 458437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 458508 2025-12-04T12:31:08.8323268Z I1204 12:30:56.750000 458437 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 458509 2025-12-04T12:31:08.8324038Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8324682Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8325702Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
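The FSDP UserWarning above names two remediations: call torch.cuda.set_device() before constructing FSDP, or pass an explicit device index as device_id. A minimal sketch of both follows, assuming one process per GPU and an already-initialized default process group; the rank argument and the Linear module are placeholders.

# Sketch of the remediation named in the warning above (assumes the default
# process group is already initialized and there is one process per GPU;
# the model is a placeholder).
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_on_rank(rank: int) -> FSDP:
    torch.cuda.set_device(rank)            # option 1: pin the current device
    model = nn.Linear(8, 8).cuda(rank)
    return FSDP(model, device_id=rank)     # option 2: explicit device index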
2025-12-04T12:31:08.8326676Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8327282Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8327922Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8328600Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8329247Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8329893Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8330536Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8331222Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8331862Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8332877Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8333849Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8334456Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8335095Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8335737Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8345806Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8346686Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8347345Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8348002Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8348698Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8349727Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8350728Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8351341Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8351982Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8352623Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8353274Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8353924Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8354572Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8355223Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:322: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8355863Z model.checkpoint1 = FSDP(module=model.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8356931Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:31:08.8357910Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:31:08.8358542Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:323: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8359177Z model.checkpoint2 = FSDP(module=model.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8359814Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:325: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8360458Z model_ac.checkpoint1 = FSDP(module=model_ac.checkpoint1, **fsdp_kwargs) 2025-12-04T12:31:08.8361107Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:326: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:31:08.8361768Z model_ac.checkpoint2 = FSDP(module=model_ac.checkpoint2, **fsdp_kwargs) 2025-12-04T12:31:08.8364132Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8366571Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8369052Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8371459Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8373917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8376329Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8378800Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:31:08.8381232Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:31:08.8381731Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8382297Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8383123Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8383929Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8384734Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8385486Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8386228Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8387013Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8387794Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8388606Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8389384Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8390756Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8391518Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8392307Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8393495Z [rank3]:E1204 12:31:03.912000 458509 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:31:08.8394608Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8395191Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8396274Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8397181Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8397789Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8398517Z [rank3]:E1204 12:31:03.912000 458509 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:31:08.8398917Z dist init r=3, world=4 2025-12-04T12:31:08.8399250Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8399812Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8400631Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8401435Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8402243Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8402995Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8403733Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8404512Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8405292Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8406116Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8406893Z 
[rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8407651Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8408453Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8409235Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8410416Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 0. CUDA driver allocated memory was 2462056448 and is now 3948937216. 2025-12-04T12:31:08.8411555Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8412132Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8413182Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8414091Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8414697Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8415391Z [rank0]:E1204 12:31:03.954000 458506 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:31:08.8415791Z dist init r=0, world=4 2025-12-04T12:31:08.8416115Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8416676Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8417494Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8418334Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8419141Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8419892Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T12:31:08.8420629Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8421407Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8422222Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8422998Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8423776Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8424533Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:31:08.8425294Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8426076Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8427281Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 1. CUDA driver allocated memory was 2317352960 and is now 3806330880. 
2025-12-04T12:31:08.8428462Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8429041Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8430093Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8430999Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8431604Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8432297Z [rank1]:E1204 12:31:03.959000 458507 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:31:08.8432692Z dist init r=1, world=4 2025-12-04T12:31:08.8433016Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:31:08.8433577Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:31:08.8434394Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8435203Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:31:08.8436011Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8436765Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:31:08.8437542Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8438360Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8439141Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8439919Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:31:08.8440699Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8441460Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T12:31:08.8442241Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8443039Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:31:08.8444221Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3789553664. 2025-12-04T12:31:08.8445325Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8445904Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8446963Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8447874Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:31:08.8448511Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8449204Z [rank2]:E1204 12:31:04.043000 458508 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:31:08.8449599Z dist init r=2, world=4 2025-12-04T12:31:08.8450260Z [rank0]:[W1204 12:31:04.801497155 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:31:08.8450945Z FAILED [8.9146s] [100%] 2025-12-04T12:31:08.8451043Z 2025-12-04T12:31:08.8451134Z =================================== FAILURES =================================== 2025-12-04T12:31:08.8451478Z _ TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda _ 2025-12-04T12:31:08.8451805Z Traceback (most recent call last): 2025-12-04T12:31:08.8452201Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:31:08.8452599Z self._join_processes(fn) 2025-12-04T12:31:08.8453037Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:31:08.8453471Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:31:08.8453909Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:31:08.8454338Z raise RuntimeError(error) 2025-12-04T12:31:08.8454578Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:31:08.8454834Z Traceback (most recent call last): 2025-12-04T12:31:08.8455222Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8455618Z getattr(self, test_name)() 2025-12-04T12:31:08.8455993Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8456376Z fn() 2025-12-04T12:31:08.8456700Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8457094Z method(*args, **kwargs) 2025-12-04T12:31:08.8457452Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8457844Z method(*args, **kwargs) 2025-12-04T12:31:08.8458237Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8458598Z with policy(): 2025-12-04T12:31:08.8458822Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8459054Z raise RuntimeError(msg) 2025-12-04T12:31:08.8459505Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 
2025-12-04T12:31:08.8459918Z 2025-12-04T12:31:08.8459995Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8460373Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8460674Z 2025-12-04T12:31:08.8460763Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8460890Z 2025-12-04T12:31:08.8460892Z 2025-12-04T12:31:08.8460972Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:31:08.8461176Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:31:08.8461564Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-ba679004c1dc5cc7.xml - 2025-12-04T12:31:08.8461913Z =========================== short test summary info ============================ 2025-12-04T12:31:08.8462295Z FAILED [8.9146s] distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:31:08.8462653Z Traceback (most recent call last): 2025-12-04T12:31:08.8462902Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:31:08.8463147Z getattr(self, test_name)() 2025-12-04T12:31:08.8463384Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:31:08.8463620Z fn() 2025-12-04T12:31:08.8463870Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8464100Z method(*args, **kwargs) 2025-12-04T12:31:08.8464321Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:31:08.8464557Z method(*args, **kwargs) 2025-12-04T12:31:08.8464781Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:31:08.8465013Z with policy(): 2025-12-04T12:31:08.8465234Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:31:08.8465471Z raise RuntimeError(msg) 2025-12-04T12:31:08.8465930Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda! Caching allocator allocated memory was 512 and is now reported as 3236352 on device 3. CUDA driver allocated memory was 2250244096 and is now 3739222016. 2025-12-04T12:31:08.8466348Z 2025-12-04T12:31:08.8466426Z To execute this test, run the following from the base repo dir: 2025-12-04T12:31:08.8466827Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_checkpoint.py TestFSDPCheckpointSubmoduleCUDA.test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8467146Z 2025-12-04T12:31:08.8467236Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:31:08.8467429Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
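The ProcessGroupNCCL warning repeated in this log notes that destroy_process_group() was never called before program exit. A single-process sketch of the init/teardown pairing it asks for; the MASTER_ADDR/MASTER_PORT values are placeholders.

# Minimal single-process sketch of pairing init_process_group with
# destroy_process_group, as the ProcessGroupNCCL warning above requests
# (rendezvous values are placeholders).
import os
import torch.distributed as dist

os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("nccl", rank=0, world_size=1)
try:
    pass  # test body / training step would run here
finally:
    dist.destroy_process_group()  # avoids the resource-leak warning at exit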
2025-12-04T12:31:08.8467602Z ======================= 1 failed, 16 deselected in 8.93s ======================= 2025-12-04T12:31:08.8467748Z Got exit code 1 2025-12-04T12:31:08.8468023Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda 2025-12-04T12:31:08.8468447Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:31:08.8468836Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-8a208df4594f8f27.xml 2025-12-04T12:31:08.8469152Z ============================= test session starts ============================== 2025-12-04T12:31:08.8469373Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:31:08.8469568Z cachedir: .pytest_cache 2025-12-04T12:31:08.8469801Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:31:08.8470048Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:31:08.8470175Z configfile: pytest.ini 2025-12-04T12:31:08.8470410Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:31:08.8470955Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_checkpoint.py:292: PytestCollectionWarning: cannot collect test class 'TestModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_checkpoint.py) 2025-12-04T12:31:08.8471371Z class TestModel(nn.Module): 2025-12-04T12:31:08.8471501Z collected 17 items / 17 deselected / 0 selected 2025-12-04T12:31:08.8471646Z stepcurrent: skipping 17 already run items. 2025-12-04T12:31:08.8471777Z Running 0 items in this shard 2025-12-04T12:31:08.8471850Z 2025-12-04T12:31:08.8472112Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_checkpoint/distributed.fsdp.test_fsdp_checkpoint-8a208df4594f8f27.xml - 2025-12-04T12:31:08.8472463Z ============================ 17 deselected in 0.01s ============================ 2025-12-04T12:31:08.8472834Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_checkpoint.py::TestFSDPCheckpointSubmoduleCUDA::test_checkpoint_submodule_use_reentrant_False_cuda'] 2025-12-04T12:31:08.8473109Z 2025-12-04T12:31:08.8473318Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_checkpoint 1/1 (test/test-reports/distributed.fsdp.test_fsdp_checkpoint_1.1_18dc4e01a7029ded_.log) 2025-12-04T12:31:08.8473557Z 2025-12-04T12:31:08.8473697Z Finished distributed/fsdp/test_fsdp_checkpoint 1/1 ... [2025-12-04 12:31:08.792587][2290967.441765475], took 2.93min 2025-12-04T12:31:08.8474145Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:31:08.8474532Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:31:08.8474753Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:31:08.8474934Z Uploading artifacts took 0.00 seconds 2025-12-04T12:31:08.8475077Z distributed/fsdp/test_fsdp_checkpoint 1/1 failed! 2025-12-04T12:31:08.8475287Z Running distributed/fsdp/test_fsdp_fine_tune 1/1 ... 
[2025-12-04 12:31:08.796400][2290967.445583852] 2025-12-04T12:31:08.8475503Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:31:08.8475906Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_fine_tune.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:31:08.796573] 2025-12-04T12:33:30.8886288Z 2025-12-04T12:33:30.8889587Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_fine_tune 1/1 (test/test-reports/distributed.fsdp.test_fsdp_fine_tune_1.1_f2107156872849a9_.log) 2025-12-04T12:33:30.8892005Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-d1b74c890111edb9.xml 2025-12-04T12:33:30.8892381Z ============================= test session starts ============================== 2025-12-04T12:33:30.8892651Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.8892855Z cachedir: .pytest_cache 2025-12-04T12:33:30.8893085Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.8893337Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.8893456Z configfile: pytest.ini 2025-12-04T12:33:30.8893690Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.8893937Z collecting ... collected 4 items 2025-12-04T12:33:30.8894082Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:33:30.8894805Z Running 4 items in this shard: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda, test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda, test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda, test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.8895474Z 2025-12-04T12:33:30.8895773Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda I1204 12:31:10.491000 458907 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 458976 2025-12-04T12:33:30.8896259Z I1204 12:31:10.491000 458907 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 458977 2025-12-04T12:33:30.8897559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.8898233Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8898823Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:33:30.8899407Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8899798Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.8900168Z return func(*args, **kwargs) 2025-12-04T12:33:30.8900537Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8900964Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.8901335Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8901761Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.8902116Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8902456Z seq = FSDP( 2025-12-04T12:33:30.8902776Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8903114Z seq = FSDP( 2025-12-04T12:33:30.8904451Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.8905894Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.8907368Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. 
If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.8908861Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.8909173Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.8909519Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.8910020Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8910505Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.8910985Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8911473Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.8911916Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8912384Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8912857Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8913323Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8913787Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8914242Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.8914698Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8915166Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.8915818Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 
2025-12-04T12:33:30.8916430Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8916784Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8917391Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8917883Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8918311Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8918729Z [rank1]:E1204 12:31:17.622000 458977 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.8918972Z dist init r=1, world=2 2025-12-04T12:33:30.8919181Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.8919521Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.8920010Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8920505Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.8921004Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8921456Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.8921898Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8922365Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8922833Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8923301Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8923764Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8924218Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.8924675Z 
[rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8925141Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.8925797Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 2025-12-04T12:33:30.8926405Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8926793Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8927364Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8927852Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8928260Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8928675Z [rank0]:E1204 12:31:17.684000 458976 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.8928917Z dist init r=0, world=2 2025-12-04T12:33:30.8929337Z [rank0]:[W1204 12:31:17.642245558 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.8929786Z FAILED [8.9124s] [ 25%] 2025-12-04T12:33:30.8929852Z 2025-12-04T12:33:30.8929935Z =================================== FAILURES =================================== 2025-12-04T12:33:30.8930128Z ____________ TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda _____________ 2025-12-04T12:33:30.8930303Z Traceback (most recent call last): 2025-12-04T12:33:30.8930550Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.8930794Z self._join_processes(fn) 2025-12-04T12:33:30.8931042Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.8931307Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.8931577Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.8931838Z raise RuntimeError(error) 2025-12-04T12:33:30.8931991Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.8932155Z Traceback (most recent call last): 2025-12-04T12:33:30.8932396Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8932638Z getattr(self, test_name)() 2025-12-04T12:33:30.8933122Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8933355Z fn() 2025-12-04T12:33:30.8933560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8933792Z method(*args, **kwargs) 2025-12-04T12:33:30.8934017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8934248Z method(*args, **kwargs) 2025-12-04T12:33:30.8934468Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8934696Z with policy(): 2025-12-04T12:33:30.8934912Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8935142Z raise RuntimeError(msg) 2025-12-04T12:33:30.8935540Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:33:30.8935903Z 2025-12-04T12:33:30.8935981Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8936342Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8936591Z 2025-12-04T12:33:30.8936684Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8936810Z 2025-12-04T12:33:30.8936812Z 2025-12-04T12:33:30.8936894Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.8937098Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.8937476Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-d1b74c890111edb9.xml - 2025-12-04T12:33:30.8937823Z =========================== short test summary info ============================ 2025-12-04T12:33:30.8938204Z FAILED [8.9124s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.8938535Z Traceback (most recent call last): 2025-12-04T12:33:30.8938784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8939048Z getattr(self, test_name)() 2025-12-04T12:33:30.8939281Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8939513Z fn() 2025-12-04T12:33:30.8939715Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8939944Z method(*args, **kwargs) 2025-12-04T12:33:30.8940165Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8940568Z method(*args, **kwargs) 2025-12-04T12:33:30.8940791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8941017Z with policy(): 2025-12-04T12:33:30.8941227Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8941459Z raise RuntimeError(msg) 2025-12-04T12:33:30.8941855Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:33:30.8942219Z 2025-12-04T12:33:30.8942294Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8942616Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8942864Z 2025-12-04T12:33:30.8942952Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8943142Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.8943302Z ============================== 1 failed in 8.92s =============================== 2025-12-04T12:33:30.8943436Z Got exit code 1 2025-12-04T12:33:30.8943532Z Retrying single test... 
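[editor's note] For context on what this failure is actually asserting: with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 the test harness records the CUDA caching allocator's allocated bytes on each device before a test and fails the test if they do not return to baseline afterwards (the "was 512 and is now reported as 88064" numbers above); per the error text it also cross-checks the CUDA driver's allocated memory. The sketch below only illustrates that idea. It is a simplified stand-in, not PyTorch's actual CudaMemoryLeakCheck, and the assert_no_cuda_leak helper name is hypothetical.

import contextlib
import gc

import torch


@contextlib.contextmanager
def assert_no_cuda_leak(device: int = 0):
    """Fail if caching-allocator memory on `device` grows across the block."""
    if not torch.cuda.is_available():
        yield
        return
    torch.cuda.synchronize(device)
    gc.collect()
    before = torch.cuda.memory_allocated(device)
    yield
    torch.cuda.synchronize(device)
    gc.collect()
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible CUDA leak on device {device}: caching allocator went "
            f"from {before} to {after} bytes"
        )


if __name__ == "__main__":
    if torch.cuda.is_available():
        with assert_no_cuda_leak(0):
            x = torch.randn(64, device="cuda")
            del x  # freed back to the caching allocator, so the check passes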
2025-12-04T12:33:30.8943805Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-133ef5108b952965.xml 2025-12-04T12:33:30.8944103Z ============================= test session starts ============================== 2025-12-04T12:33:30.8944316Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.8944504Z cachedir: .pytest_cache 2025-12-04T12:33:30.8944767Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.8945009Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.8945130Z configfile: pytest.ini 2025-12-04T12:33:30.8945359Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.8945631Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.8945942Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8946224Z Running 1 items in this shard 2025-12-04T12:33:30.8946299Z 2025-12-04T12:33:30.8946595Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda I1204 12:31:21.924000 459143 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 459212 2025-12-04T12:33:30.8947078Z I1204 12:31:21.924000 459143 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 459213 2025-12-04T12:33:30.8947793Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.8948465Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8949052Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.8949634Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8950024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.8950395Z return func(*args, **kwargs) 2025-12-04T12:33:30.8950750Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8951115Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.8951470Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8951826Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.8952177Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8952517Z seq = FSDP( 2025-12-04T12:33:30.8952836Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.8953170Z seq = FSDP( 2025-12-04T12:33:30.8954538Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.8955965Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.8957406Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.8958892Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.8959199Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.8959545Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.8960040Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8960526Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.8961009Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8961465Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.8961912Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8962380Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8962847Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8963311Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8963773Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8964265Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.8964724Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8965192Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.8965846Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2021654528 and is now 3539992576. 
2025-12-04T12:33:30.8966458Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8966809Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8967397Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8967899Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8968340Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8968758Z [rank0]:E1204 12:31:29.101000 459212 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.8969002Z dist init r=0, world=2 2025-12-04T12:33:30.8969206Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.8969547Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.8970068Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8970550Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.8971032Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8971480Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.8971923Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8972395Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8972859Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8973322Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.8973842Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8974298Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.8974757Z 
[rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8975226Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.8975874Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:33:30.8976495Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8976867Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8977440Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8977930Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.8978351Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8978767Z [rank1]:E1204 12:31:29.192000 459213 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.8979010Z dist init r=1, world=2 2025-12-04T12:33:30.8979412Z [rank0]:[W1204 12:31:29.981792807 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.8979821Z FAILED [9.0119s] [100%] 2025-12-04T12:33:30.8979888Z 2025-12-04T12:33:30.8979946Z =================================== FAILURES =================================== 2025-12-04T12:33:30.8980136Z ____________ TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda _____________ 2025-12-04T12:33:30.8980312Z Traceback (most recent call last): 2025-12-04T12:33:30.8980560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.8980805Z self._join_processes(fn) 2025-12-04T12:33:30.8981054Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.8981321Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.8981592Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.8981855Z raise RuntimeError(error) 2025-12-04T12:33:30.8982008Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.8982171Z Traceback (most recent call last): 2025-12-04T12:33:30.8982412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8982694Z getattr(self, test_name)() 2025-12-04T12:33:30.8982927Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8983163Z fn() 2025-12-04T12:33:30.8983366Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8983598Z method(*args, **kwargs) 2025-12-04T12:33:30.8983820Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8984051Z method(*args, **kwargs) 2025-12-04T12:33:30.8984270Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8984496Z with policy(): 2025-12-04T12:33:30.8984709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8984942Z raise RuntimeError(msg) 2025-12-04T12:33:30.8985344Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2021654528 and is now 3539992576. 2025-12-04T12:33:30.8985740Z 2025-12-04T12:33:30.8985816Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8986138Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8986386Z 2025-12-04T12:33:30.8986476Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8986603Z 2025-12-04T12:33:30.8986605Z 2025-12-04T12:33:30.8986683Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.8986889Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.8987262Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-133ef5108b952965.xml - 2025-12-04T12:33:30.8987606Z =========================== short test summary info ============================ 2025-12-04T12:33:30.8987939Z FAILED [9.0119s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.8988320Z Traceback (most recent call last): 2025-12-04T12:33:30.8988567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.8988811Z getattr(self, test_name)() 2025-12-04T12:33:30.8989045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.8989279Z fn() 2025-12-04T12:33:30.8989482Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8989713Z method(*args, **kwargs) 2025-12-04T12:33:30.8989935Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.8990164Z method(*args, **kwargs) 2025-12-04T12:33:30.8990383Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.8990608Z with policy(): 2025-12-04T12:33:30.8990817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.8991049Z raise RuntimeError(msg) 2025-12-04T12:33:30.8991488Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2021654528 and is now 3539992576. 2025-12-04T12:33:30.8991855Z 2025-12-04T12:33:30.8991933Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.8992258Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8992502Z 2025-12-04T12:33:30.8992592Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.8992781Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.8992949Z ======================= 1 failed, 3 deselected in 9.02s ======================== 2025-12-04T12:33:30.8993087Z Got exit code 1 2025-12-04T12:33:30.8993185Z Retrying single test... 
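[editor's note] The UserWarning repeated in each attempt above ("FSDP got the argument `device_id` cuda ... which does not have an explicit index") spells out its own fix: pin each rank to an explicit device index before constructing FSDP, rather than passing the index-less string "cuda". A minimal sketch of that pattern follows; the model, the LOCAL_RANK handling, and the torchrun-style launch are illustrative assumptions, not taken from test_fsdp_fine_tune.py.

import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main() -> None:
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)        # explicit device, as the warning asks
    dist.init_process_group(backend="nccl")  # rendezvous env vars set by the launcher

    model = nn.Linear(16, 16).cuda(local_rank)
    fsdp_model = FSDP(model, device_id=local_rank)  # pass the index, not "cuda"

    out = fsdp_model(torch.randn(4, 16, device=local_rank))
    out.sum().backward()

    # Explicit teardown also avoids the ProcessGroupNCCL shutdown warning seen above.
    dist.destroy_process_group()


if __name__ == "__main__":
    main()

A script like this would be launched with something like torchrun --nproc-per-node=2 on a multi-GPU host (hypothetical invocation, not part of this job's test command).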
2025-12-04T12:33:30.8993459Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-cdaf34baac2ba9f9.xml 2025-12-04T12:33:30.8993774Z ============================= test session starts ============================== 2025-12-04T12:33:30.8993986Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.8994191Z cachedir: .pytest_cache 2025-12-04T12:33:30.8994414Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.8994653Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.8994774Z configfile: pytest.ini 2025-12-04T12:33:30.8995003Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.8995274Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.8995588Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.8995869Z Running 1 items in this shard 2025-12-04T12:33:30.8995944Z 2025-12-04T12:33:30.8996241Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda I1204 12:31:33.520000 459379 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 459448 2025-12-04T12:33:30.8996726Z I1204 12:31:33.521000 459379 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 459449 2025-12-04T12:33:30.8997423Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.8998010Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8998630Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.8999218Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.8999608Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.8999976Z return func(*args, **kwargs) 2025-12-04T12:33:30.9000372Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9000738Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9001095Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9001452Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9001798Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9002137Z seq = FSDP( 2025-12-04T12:33:30.9002454Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:123: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9002789Z seq = FSDP( 2025-12-04T12:33:30.9004121Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9005580Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9007019Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9008485Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9008794Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9009138Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9009633Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9010117Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9010631Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9011088Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9011532Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9012001Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9012472Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9012937Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9013416Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9013886Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9014348Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9014819Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9015474Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 
2025-12-04T12:33:30.9016083Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9016438Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9017012Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9017505Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9017876Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9018337Z [rank1]:E1204 12:31:40.483000 459449 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9018581Z dist init r=1, world=2 2025-12-04T12:33:30.9018789Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9019131Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9019653Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9020139Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9020620Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9021073Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9021514Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9021981Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9022446Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9022941Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9023409Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9023864Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9024323Z 
[rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9024791Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9025438Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 2025-12-04T12:33:30.9026043Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9026395Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9026970Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9027459Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9027825Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9028289Z [rank0]:E1204 12:31:40.488000 459448 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9028532Z dist init r=0, world=2 2025-12-04T12:33:30.9028970Z [rank0]:[W1204 12:31:40.338955905 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9029386Z FAILED [8.7113s] [100%] 2025-12-04T12:33:30.9029453Z 2025-12-04T12:33:30.9029511Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9029702Z ____________ TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda _____________ 2025-12-04T12:33:30.9029878Z Traceback (most recent call last): 2025-12-04T12:33:30.9030125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9030369Z self._join_processes(fn) 2025-12-04T12:33:30.9030616Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9030881Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9031154Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9031432Z raise RuntimeError(error) 2025-12-04T12:33:30.9031586Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9031775Z Traceback (most recent call last): 2025-12-04T12:33:30.9032017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9032260Z getattr(self, test_name)() 2025-12-04T12:33:30.9032493Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9032726Z fn() 2025-12-04T12:33:30.9032937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9033169Z method(*args, **kwargs) 2025-12-04T12:33:30.9033395Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9033625Z method(*args, **kwargs) 2025-12-04T12:33:30.9033846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9034074Z with policy(): 2025-12-04T12:33:30.9034287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9034518Z raise RuntimeError(msg) 2025-12-04T12:33:30.9034920Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 
2025-12-04T12:33:30.9035283Z 2025-12-04T12:33:30.9035358Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9035682Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9035931Z 2025-12-04T12:33:30.9036020Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9036149Z 2025-12-04T12:33:30.9036208Z Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9036350Z Traceback (most recent call last): 2025-12-04T12:33:30.9036594Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9036837Z getattr(self, test_name)() 2025-12-04T12:33:30.9037071Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9046343Z fn() 2025-12-04T12:33:30.9046583Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9046892Z method(*args, **kwargs) 2025-12-04T12:33:30.9047122Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9047360Z method(*args, **kwargs) 2025-12-04T12:33:30.9047587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9047818Z with policy(): 2025-12-04T12:33:30.9048036Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9048323Z raise RuntimeError(msg) 2025-12-04T12:33:30.9048729Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:33:30.9049097Z 2025-12-04T12:33:30.9049182Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9049508Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9049790Z 2025-12-04T12:33:30.9049886Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9050015Z 2025-12-04T12:33:30.9050016Z 2025-12-04T12:33:30.9050103Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9050309Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9050691Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-cdaf34baac2ba9f9.xml - 2025-12-04T12:33:30.9051045Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9051391Z FAILED [8.7113s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9051715Z Traceback (most recent call last): 2025-12-04T12:33:30.9051968Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9052218Z getattr(self, test_name)() 2025-12-04T12:33:30.9052459Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9052696Z fn() 2025-12-04T12:33:30.9052906Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9053142Z method(*args, **kwargs) 2025-12-04T12:33:30.9053364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9053598Z method(*args, **kwargs) 2025-12-04T12:33:30.9053819Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9054051Z with policy(): 2025-12-04T12:33:30.9054265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9054498Z raise RuntimeError(msg) 2025-12-04T12:33:30.9054902Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 0. CUDA driver allocated memory was 2019557376 and is now 3539992576. 
2025-12-04T12:33:30.9055267Z 2025-12-04T12:33:30.9055342Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9055701Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9055952Z 2025-12-04T12:33:30.9056045Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9056174Z 2025-12-04T12:33:30.9056234Z Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9056382Z Traceback (most recent call last): 2025-12-04T12:33:30.9056628Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9056875Z getattr(self, test_name)() 2025-12-04T12:33:30.9057111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9057347Z fn() 2025-12-04T12:33:30.9057551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9057781Z method(*args, **kwargs) 2025-12-04T12:33:30.9058010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9058290Z method(*args, **kwargs) 2025-12-04T12:33:30.9058511Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9058759Z with policy(): 2025-12-04T12:33:30.9058973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9059207Z raise RuntimeError(msg) 2025-12-04T12:33:30.9059607Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda! Caching allocator allocated memory was 512 and is now reported as 88064 on device 1. CUDA driver allocated memory was 1864368128 and is now 3384803328. 2025-12-04T12:33:30.9059973Z 2025-12-04T12:33:30.9060048Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9060375Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9060623Z 2025-12-04T12:33:30.9060715Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9060908Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
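The RuntimeError above comes from PyTorch's opt-in leak checker (enabled here via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1), which snapshots caching-allocator and driver memory around the test body and fails the test if usage has grown by the time the context manager exits. A minimal sketch of that idea, assuming a single visible CUDA/ROCm device; this is not the actual torch.testing._internal.common_utils implementation:

    # Sketch of a memory-leak guard in the spirit of PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1.
    # Assumption: one GPU is visible; this is NOT the real common_utils checker.
    import torch

    class MemLeakGuard:
        def __init__(self, device: int = 0) -> None:
            self.device = device
            self.before = 0

        def __enter__(self) -> "MemLeakGuard":
            torch.cuda.synchronize(self.device)
            # Bytes held by the caching allocator before the guarded block runs.
            self.before = torch.cuda.memory_allocated(self.device)
            return self

        def __exit__(self, exc_type, exc, tb) -> bool:
            if exc_type is not None:
                return False  # let the original test failure propagate unchanged
            torch.cuda.synchronize(self.device)
            after = torch.cuda.memory_allocated(self.device)
            if after > self.before:
                raise RuntimeError(
                    f"possible GPU memory leak on device {self.device}: "
                    f"{self.before} -> {after} bytes still allocated"
                )
            return False

Used as `with MemLeakGuard(): run_test_body()`, this fails the same way the log does when a test leaves allocations behind; the repro command printed above runs the real checker against just the failing test.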
2025-12-04T12:33:30.9061077Z ======================= 1 failed, 3 deselected in 8.72s ======================== 2025-12-04T12:33:30.9061220Z Got exit code 1 2025-12-04T12:33:30.9061441Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda 2025-12-04T12:33:30.9061767Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:33:30.9062142Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-62a07bc624719721.xml 2025-12-04T12:33:30.9062447Z ============================= test session starts ============================== 2025-12-04T12:33:30.9062670Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9062868Z cachedir: .pytest_cache 2025-12-04T12:33:30.9063098Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9063341Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9063466Z configfile: pytest.ini 2025-12-04T12:33:30.9063698Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9063973Z collecting ... collected 4 items / 1 deselected / 3 selected 2025-12-04T12:33:30.9064135Z stepcurrent: skipping 1 already run items. 2025-12-04T12:33:30.9064270Z Running 3 items in this shard 2025-12-04T12:33:30.9064344Z 2025-12-04T12:33:30.9064678Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda I1204 12:31:44.481000 459615 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 459684 2025-12-04T12:33:30.9065165Z I1204 12:31:44.482000 459615 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 459685 2025-12-04T12:33:30.9065864Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9066458Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9067048Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9067661Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9068059Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9068472Z return func(*args, **kwargs) 2025-12-04T12:33:30.9068832Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9069270Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9069632Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9069997Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9070346Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9070692Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9071017Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9071361Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9072711Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9074142Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9075610Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9077043Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9077378Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9077741Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9078286Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9078775Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9079263Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9079723Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9080172Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9080644Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9081114Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9081589Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9082058Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9082517Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9082981Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9083451Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9084137Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30208 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 
2025-12-04T12:33:30.9084754Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9085113Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9085690Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9086180Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9086553Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9086988Z [rank1]:E1204 12:31:52.745000 459685 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9087247Z dist init r=1, world=2 2025-12-04T12:33:30.9087456Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9087800Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9088334Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9088831Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9089315Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9089769Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9090211Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9090680Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9091149Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9091616Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9092082Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9092537Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9092996Z 
[rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9093498Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9094151Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:33:30.9094761Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9095113Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9095689Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9096194Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9096580Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9096995Z [rank0]:E1204 12:31:52.752000 459684 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9097238Z dist init r=0, world=2 2025-12-04T12:33:30.9097644Z [rank0]:[W1204 12:31:52.601311367 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9098061Z FAILED [10.0141s] [ 33%] 2025-12-04T12:33:30.9098132Z 2025-12-04T12:33:30.9098245Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9098440Z _____________ TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda _____________ 2025-12-04T12:33:30.9098621Z Traceback (most recent call last): 2025-12-04T12:33:30.9098873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9099122Z self._join_processes(fn) 2025-12-04T12:33:30.9099374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9099641Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9099916Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9100177Z raise RuntimeError(error) 2025-12-04T12:33:30.9100332Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9100494Z Traceback (most recent call last): 2025-12-04T12:33:30.9100734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9100975Z getattr(self, test_name)() 2025-12-04T12:33:30.9101205Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9101435Z fn() 2025-12-04T12:33:30.9101636Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9101865Z method(*args, **kwargs) 2025-12-04T12:33:30.9102085Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9102313Z method(*args, **kwargs) 2025-12-04T12:33:30.9102567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9102793Z with policy(): 2025-12-04T12:33:30.9103003Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9103233Z raise RuntimeError(msg) 2025-12-04T12:33:30.9103626Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30208 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:33:30.9103986Z 2025-12-04T12:33:30.9104063Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9104383Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9104626Z 2025-12-04T12:33:30.9104722Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9104861Z 2025-12-04T12:33:30.9104863Z 2025-12-04T12:33:30.9104944Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9105163Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9105535Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-62a07bc624719721.xml - 2025-12-04T12:33:30.9105876Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9106207Z FAILED [10.0141s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9106517Z Traceback (most recent call last): 2025-12-04T12:33:30.9106768Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9107013Z getattr(self, test_name)() 2025-12-04T12:33:30.9107246Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9107479Z fn() 2025-12-04T12:33:30.9107679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9107909Z method(*args, **kwargs) 2025-12-04T12:33:30.9108127Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9108408Z method(*args, **kwargs) 2025-12-04T12:33:30.9108628Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9108853Z with policy(): 2025-12-04T12:33:30.9109067Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9109298Z raise RuntimeError(msg) 2025-12-04T12:33:30.9109695Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30208 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:33:30.9110061Z 2025-12-04T12:33:30.9110135Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9110455Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9110698Z 2025-12-04T12:33:30.9110787Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9111013Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9111181Z ======================= 1 failed, 1 deselected in 10.02s ======================= 2025-12-04T12:33:30.9111320Z Got exit code 1 2025-12-04T12:33:30.9111417Z Retrying single test... 
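The repeated ProcessGroupNCCL warning ("destroy_process_group() was not called before program exit") is advisory: it fires whenever a rank exits while the default process group is still initialized. A hedged sketch of the explicit teardown it asks for, written as a hypothetical standalone script launched with torchrun; the test harness in this log manages its own process groups, so this is illustration only:

    # Hypothetical standalone pattern (launch with torchrun); not code from this test suite.
    import os
    import torch
    import torch.distributed as dist

    def main() -> None:
        rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(rank)  # also addresses the FSDP `device_id` UserWarning seen above
        # Passing device_id to init_process_group is what the barrier() UserWarning suggests.
        dist.init_process_group("nccl", device_id=torch.device("cuda", rank))
        try:
            pass  # training / test body would go here
        finally:
            # Explicit teardown; skipping this is what triggers the
            # "destroy_process_group() was not called" warning in the log.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()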
2025-12-04T12:33:30.9111687Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-fa7737fbb8bb2551.xml 2025-12-04T12:33:30.9111985Z ============================= test session starts ============================== 2025-12-04T12:33:30.9112198Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9112385Z cachedir: .pytest_cache 2025-12-04T12:33:30.9112609Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9112848Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9112968Z configfile: pytest.ini 2025-12-04T12:33:30.9113198Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9113487Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9113796Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9114094Z Running 1 items in this shard 2025-12-04T12:33:30.9114169Z 2025-12-04T12:33:30.9114462Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda I1204 12:31:56.790000 459851 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 459920 2025-12-04T12:33:30.9114941Z I1204 12:31:56.790000 459851 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 459921 2025-12-04T12:33:30.9115636Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9116228Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9116814Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9117401Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9117796Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9118212Z return func(*args, **kwargs) 2025-12-04T12:33:30.9118570Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9118931Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9119288Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9119644Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9119991Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9120360Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9120684Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9121022Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9122366Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9123825Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9125262Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9126677Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9126984Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9127328Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9127825Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9128354Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9128838Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9129289Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9129734Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9130232Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9130704Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9131170Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9131635Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9132090Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9132548Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9133028Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9133690Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 
2025-12-04T12:33:30.9134295Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9134650Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9135224Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9135715Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9136083Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9136497Z [rank0]:E1204 12:32:04.936000 459920 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9136739Z dist init r=0, world=2 2025-12-04T12:33:30.9136944Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9137284Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9137773Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9138304Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9138783Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9139235Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9139713Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9140181Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9140647Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9141113Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9141580Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9142035Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9142506Z 
[rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9142986Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9143629Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 30208 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:33:30.9144232Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9144586Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9145159Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9145644Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9146009Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9146425Z [rank1]:E1204 12:32:04.937000 459921 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9146668Z dist init r=1, world=2 2025-12-04T12:33:30.9147070Z [rank0]:[W1204 12:32:05.787910829 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9147483Z FAILED [9.9121s] [100%] 2025-12-04T12:33:30.9147547Z 2025-12-04T12:33:30.9147608Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9147798Z _____________ TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda _____________ 2025-12-04T12:33:30.9147974Z Traceback (most recent call last): 2025-12-04T12:33:30.9148359Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9148626Z self._join_processes(fn) 2025-12-04T12:33:30.9148948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9149228Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9149503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9149763Z raise RuntimeError(error) 2025-12-04T12:33:30.9149916Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9150080Z Traceback (most recent call last): 2025-12-04T12:33:30.9150324Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9150566Z getattr(self, test_name)() 2025-12-04T12:33:30.9150799Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9151036Z fn() 2025-12-04T12:33:30.9151238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9151487Z method(*args, **kwargs) 2025-12-04T12:33:30.9151709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9151955Z method(*args, **kwargs) 2025-12-04T12:33:30.9152175Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9152403Z with policy(): 2025-12-04T12:33:30.9152617Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9152850Z raise RuntimeError(msg) 2025-12-04T12:33:30.9153253Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:33:30.9153614Z 2025-12-04T12:33:30.9153690Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9154012Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9154256Z 2025-12-04T12:33:30.9154346Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9154470Z 2025-12-04T12:33:30.9154472Z 2025-12-04T12:33:30.9154554Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9154757Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9155135Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-fa7737fbb8bb2551.xml - 2025-12-04T12:33:30.9155480Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9155810Z FAILED [9.9121s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9156119Z Traceback (most recent call last): 2025-12-04T12:33:30.9156364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9156608Z getattr(self, test_name)() 2025-12-04T12:33:30.9156841Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9157075Z fn() 2025-12-04T12:33:30.9157278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9157535Z method(*args, **kwargs) 2025-12-04T12:33:30.9157761Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9157992Z method(*args, **kwargs) 2025-12-04T12:33:30.9158255Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9158485Z with policy(): 2025-12-04T12:33:30.9158698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9158928Z raise RuntimeError(msg) 2025-12-04T12:33:30.9159325Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:33:30.9159686Z 2025-12-04T12:33:30.9159766Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9160088Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9160378Z 2025-12-04T12:33:30.9160466Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9160655Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9160821Z ======================= 1 failed, 3 deselected in 9.92s ======================== 2025-12-04T12:33:30.9160957Z Got exit code 1 2025-12-04T12:33:30.9161054Z Retrying single test... 
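The two recurring FutureWarning/UserWarning messages in these sessions are advisory rather than the cause of the failure; both state their own remediation. A sketch of those suggestions, using a stand-in Linear module and a single-process gloo group purely so the snippet is self-contained (none of this is code from test_fsdp_fine_tune.py):

    # Illustrative only: the alternatives the two recurring warnings themselves suggest.
    import os
    import torch
    import torch.distributed as dist
    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP

    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)  # single-process world for the demo

    # 1) The FutureWarning recommends plain DDP over the deprecated NO_SHARD strategy.
    model = nn.Linear(8, 8)   # stand-in for the test's model
    ddp_model = DDP(model)    # in place of FSDP(model, sharding_strategy=ShardingStrategy.NO_SHARD)

    # 2) If the AccumulateGrad stream mismatch is intentional, the warning names the
    #    switch that silences it:
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)

    dist.destroy_process_group()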
2025-12-04T12:33:30.9161320Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-236a181fc18f35dc.xml 2025-12-04T12:33:30.9161616Z ============================= test session starts ============================== 2025-12-04T12:33:30.9161828Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9162016Z cachedir: .pytest_cache 2025-12-04T12:33:30.9162238Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9162477Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9162595Z configfile: pytest.ini 2025-12-04T12:33:30.9162822Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9163091Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9163399Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9163678Z Running 1 items in this shard 2025-12-04T12:33:30.9163752Z 2025-12-04T12:33:30.9164044Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda I1204 12:32:09.016000 460087 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 460156 2025-12-04T12:33:30.9164522Z I1204 12:32:09.017000 460087 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 460157 2025-12-04T12:33:30.9165213Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9165799Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9166414Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9167004Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9167403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9167770Z return func(*args, **kwargs) 2025-12-04T12:33:30.9168127Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9168532Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9168890Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9169262Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9169620Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9169959Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9170281Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:246: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9170615Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9171950Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9173376Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9174811Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9176262Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9176567Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9176910Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9177401Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9177882Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9178402Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9178852Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9179310Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9179792Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9180259Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9180722Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9181186Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9181640Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9182094Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9182559Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9183203Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 
2025-12-04T12:33:30.9183806Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9184157Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9184728Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9185212Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9185614Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9186030Z [rank0]:E1204 12:32:17.234000 460156 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9186272Z dist init r=0, world=2 2025-12-04T12:33:30.9186475Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9186813Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9187298Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9187776Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9188294Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9188773Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9189210Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9189673Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9190138Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9190602Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9191066Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9191519Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9191975Z 
[rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9192443Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9193085Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 1. CUDA driver allocated memory was 1864368128 and is now 3388997632. 2025-12-04T12:33:30.9193686Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9194035Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9194636Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9195118Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9195489Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9195904Z [rank1]:E1204 12:32:17.243000 460157 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9196143Z dist init r=1, world=2 2025-12-04T12:33:30.9196544Z [rank0]:[W1204 12:32:17.089848995 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9196953Z FAILED [10.0126s] [100%] 2025-12-04T12:33:30.9197020Z 2025-12-04T12:33:30.9197082Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9197282Z _____________ TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda _____________ 2025-12-04T12:33:30.9197456Z Traceback (most recent call last): 2025-12-04T12:33:30.9197712Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9197954Z self._join_processes(fn) 2025-12-04T12:33:30.9198238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9198501Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9198767Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9199024Z raise RuntimeError(error) 2025-12-04T12:33:30.9199179Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9199340Z Traceback (most recent call last): 2025-12-04T12:33:30.9199580Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9199821Z getattr(self, test_name)() 2025-12-04T12:33:30.9200053Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9200282Z fn() 2025-12-04T12:33:30.9200484Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9200713Z method(*args, **kwargs) 2025-12-04T12:33:30.9200932Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9201201Z method(*args, **kwargs) 2025-12-04T12:33:30.9201425Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9201651Z with policy(): 2025-12-04T12:33:30.9201862Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9202094Z raise RuntimeError(msg) 2025-12-04T12:33:30.9202491Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:33:30.9202850Z 2025-12-04T12:33:30.9202926Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9203244Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9203487Z 2025-12-04T12:33:30.9203614Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9203740Z 2025-12-04T12:33:30.9203744Z 2025-12-04T12:33:30.9203821Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9204022Z Process 0 terminated with exit code 10, terminating remaining processes. 
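The _init_utils.py UserWarning above flags a `device_id` of plain `cuda` with no index, and tells each rank to either call `torch.cuda.set_device()` first or pass an indexed device. A minimal sketch of that fix, assuming a per-rank setup like the one this harness uses (the `wrap_model`, `model`, and `rank` names here are illustrative, not taken from test_fsdp_fine_tune.py):

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(model, rank):
    # Bind this process to its GPU before constructing FSDP, as the warning asks.
    torch.cuda.set_device(rank)
    # An indexed device (or the integer rank) avoids the "no explicit index" warning.
    return FSDP(model, device_id=torch.device("cuda", rank))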
2025-12-04T12:33:30.9204392Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-236a181fc18f35dc.xml - 2025-12-04T12:33:30.9204737Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9205067Z FAILED [10.0126s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9205375Z Traceback (most recent call last): 2025-12-04T12:33:30.9205625Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9205866Z getattr(self, test_name)() 2025-12-04T12:33:30.9206113Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9206359Z fn() 2025-12-04T12:33:30.9206558Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9206786Z method(*args, **kwargs) 2025-12-04T12:33:30.9207005Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9207233Z method(*args, **kwargs) 2025-12-04T12:33:30.9207453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9207676Z with policy(): 2025-12-04T12:33:30.9207891Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9208121Z raise RuntimeError(msg) 2025-12-04T12:33:30.9208555Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda! Caching allocator allocated memory was 512 and is now reported as 29696 on device 0. CUDA driver allocated memory was 2019557376 and is now 3544186880. 2025-12-04T12:33:30.9208917Z 2025-12-04T12:33:30.9208991Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9209308Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9209555Z 2025-12-04T12:33:30.9209642Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9209828Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
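The RuntimeError above compares caching-allocator byte counts before and after the test body (512 vs. 29696 on device 0). A rough sketch of that kind of before/after comparison using the public torch.cuda counters; this is illustrative only, not the leak-check context manager in common_utils.py:

import torch

def allocated_delta(device, fn):
    # Snapshot the caching allocator, run the work, snapshot again, and
    # report the growth -- the same quantity the failure message prints.
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)
    fn()
    torch.cuda.synchronize(device)
    after = torch.cuda.memory_allocated(device)
    return before, after, after - before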
2025-12-04T12:33:30.9209994Z ======================= 1 failed, 3 deselected in 10.02s ======================= 2025-12-04T12:33:30.9210130Z Got exit code 1 2025-12-04T12:33:30.9210345Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda 2025-12-04T12:33:30.9210663Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:33:30.9211031Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-65ad0c2371b7284d.xml 2025-12-04T12:33:30.9211325Z ============================= test session starts ============================== 2025-12-04T12:33:30.9211535Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9211724Z cachedir: .pytest_cache 2025-12-04T12:33:30.9211947Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9212221Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9212339Z configfile: pytest.ini 2025-12-04T12:33:30.9212565Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9212834Z collecting ... collected 4 items / 2 deselected / 2 selected 2025-12-04T12:33:30.9212993Z stepcurrent: skipping 2 already run items. 2025-12-04T12:33:30.9213122Z Running 2 items in this shard 2025-12-04T12:33:30.9213195Z 2025-12-04T12:33:30.9213478Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda I1204 12:32:21.285000 460323 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 460392 2025-12-04T12:33:30.9213944Z I1204 12:32:21.286000 460323 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 460393 2025-12-04T12:33:30.9214636Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9215258Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9215848Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9216431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9216821Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9217190Z return func(*args, **kwargs) 2025-12-04T12:33:30.9217544Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9217902Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9218294Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9218648Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9218991Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9219328Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9219649Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9219985Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9221376Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9222807Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9224238Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9225679Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9225985Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9226329Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9226822Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9227307Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9227788Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9228322Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9228766Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9229233Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9229697Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9230164Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9230627Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9231082Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9231575Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9232042Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9232677Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 
2025-12-04T12:33:30.9233269Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9233621Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9234181Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9234690Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9235056Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9235476Z [rank1]:E1204 12:32:28.392000 460393 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9235717Z dist init r=1, world=2 2025-12-04T12:33:30.9235926Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9236264Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9236751Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9237231Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9237710Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9238214Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9238655Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9239120Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9239584Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9240045Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9240539Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9240993Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9241452Z 
[rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9241916Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9242547Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:33:30.9243140Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9243504Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9244083Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9244560Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9244925Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9245339Z [rank0]:E1204 12:32:28.454000 460392 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9245578Z dist init r=0, world=2 2025-12-04T12:33:30.9245981Z [rank0]:[W1204 12:32:28.408349308 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9246391Z FAILED [8.8114s] [ 50%] 2025-12-04T12:33:30.9246456Z 2025-12-04T12:33:30.9246516Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9246703Z ________________ TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda ________________ 2025-12-04T12:33:30.9246877Z Traceback (most recent call last): 2025-12-04T12:33:30.9247123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9247371Z self._join_processes(fn) 2025-12-04T12:33:30.9247619Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9247884Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9248189Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9248452Z raise RuntimeError(error) 2025-12-04T12:33:30.9248606Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9248770Z Traceback (most recent call last): 2025-12-04T12:33:30.9249010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9249252Z getattr(self, test_name)() 2025-12-04T12:33:30.9249522Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9249754Z fn() 2025-12-04T12:33:30.9249957Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9250190Z method(*args, **kwargs) 2025-12-04T12:33:30.9250411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9250640Z method(*args, **kwargs) 2025-12-04T12:33:30.9250858Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9251085Z with policy(): 2025-12-04T12:33:30.9251297Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9251526Z raise RuntimeError(msg) 2025-12-04T12:33:30.9251912Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:33:30.9252277Z 2025-12-04T12:33:30.9252353Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9252677Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9252915Z 2025-12-04T12:33:30.9253003Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9253129Z 2025-12-04T12:33:30.9253131Z 2025-12-04T12:33:30.9253208Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9253409Z Process 1 terminated with exit code 10, terminating remaining processes. 
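The ProcessGroupNCCL warning just above complains that destroy_process_group() was never called before exit. A minimal teardown sketch matching what it asks for (the `teardown` name is illustrative):

import torch.distributed as dist

def teardown():
    # Explicitly tear the process group down so NCCL/RCCL resources are
    # released before the worker process exits, as the warning requests.
    if dist.is_initialized():
        dist.destroy_process_group()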
2025-12-04T12:33:30.9253781Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-65ad0c2371b7284d.xml - 2025-12-04T12:33:30.9254122Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9254438Z FAILED [8.8114s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9254734Z Traceback (most recent call last): 2025-12-04T12:33:30.9254977Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9255220Z getattr(self, test_name)() 2025-12-04T12:33:30.9255453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9255684Z fn() 2025-12-04T12:33:30.9255883Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9256113Z method(*args, **kwargs) 2025-12-04T12:33:30.9256333Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9256560Z method(*args, **kwargs) 2025-12-04T12:33:30.9256779Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9257004Z with policy(): 2025-12-04T12:33:30.9257213Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9257444Z raise RuntimeError(msg) 2025-12-04T12:33:30.9257831Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:33:30.9258221Z 2025-12-04T12:33:30.9258324Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9258634Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9258870Z 2025-12-04T12:33:30.9258961Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9259148Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9259313Z ======================= 1 failed, 2 deselected in 8.82s ======================== 2025-12-04T12:33:30.9259450Z Got exit code 1 2025-12-04T12:33:30.9259545Z Retrying single test... 
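The repeated FutureWarning in this run says the `NO_SHARD` strategy is deprecated and points at DistributedDataParallel for the unsharded case. A minimal sketch of that suggested replacement (the `wrap_unsharded`, `model`, and `rank` names are illustrative):

from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_unsharded(model, rank):
    # Plain DDP wrapping, which the deprecation warning recommends instead
    # of FSDP with ShardingStrategy.NO_SHARD.
    return DDP(model.cuda(rank), device_ids=[rank])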
2025-12-04T12:33:30.9259813Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-809e0096656d718a.xml 2025-12-04T12:33:30.9260109Z ============================= test session starts ============================== 2025-12-04T12:33:30.9260321Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9260525Z cachedir: .pytest_cache 2025-12-04T12:33:30.9260747Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9261002Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9261119Z configfile: pytest.ini 2025-12-04T12:33:30.9261347Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9261619Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9261923Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda 2025-12-04T12:33:30.9262194Z Running 1 items in this shard 2025-12-04T12:33:30.9262266Z 2025-12-04T12:33:30.9262551Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda I1204 12:32:32.668000 460559 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 460628 2025-12-04T12:33:30.9263021Z I1204 12:32:32.669000 460559 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 460629 2025-12-04T12:33:30.9263712Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9264299Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9264882Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9265468Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9265858Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9266224Z return func(*args, **kwargs) 2025-12-04T12:33:30.9266580Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9266938Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9267315Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9267674Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9268021Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9268410Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9268733Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9269073Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9270413Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9271877Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9273315Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9274740Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9275045Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9275389Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9275880Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9276361Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9276873Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9277324Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9277767Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9278286Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9278753Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9279216Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9279683Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9280148Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9280621Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9281089Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9281728Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 18432 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 
2025-12-04T12:33:30.9282321Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9282671Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9283233Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9283713Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9284080Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9284495Z [rank1]:E1204 12:32:39.714000 460629 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9284738Z dist init r=1, world=2 2025-12-04T12:33:30.9284942Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9285279Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9285765Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9286281Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9286763Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9287216Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9287656Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9288122Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9288614Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9289077Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9289557Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9290022Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9290478Z 
[rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9290948Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9291583Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:33:30.9292179Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9292531Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9293095Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9293570Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9293941Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9294357Z [rank0]:E1204 12:32:39.716000 460628 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9294602Z dist init r=0, world=2 2025-12-04T12:33:30.9295002Z [rank0]:[W1204 12:32:39.632374026 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9295046Z FAILED [9.0118s] [100%] 2025-12-04T12:33:30.9295048Z 2025-12-04T12:33:30.9295145Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9295239Z ________________ TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda ________________ 2025-12-04T12:33:30.9295293Z Traceback (most recent call last): 2025-12-04T12:33:30.9295465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9295513Z self._join_processes(fn) 2025-12-04T12:33:30.9295696Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9295753Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9295941Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9295987Z raise RuntimeError(error) 2025-12-04T12:33:30.9296076Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9296131Z Traceback (most recent call last): 2025-12-04T12:33:30.9296299Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9296357Z getattr(self, test_name)() 2025-12-04T12:33:30.9296535Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9296574Z fn() 2025-12-04T12:33:30.9296733Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9296777Z method(*args, **kwargs) 2025-12-04T12:33:30.9296934Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9296978Z method(*args, **kwargs) 2025-12-04T12:33:30.9297138Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9297180Z with policy(): 2025-12-04T12:33:30.9297340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9297386Z raise RuntimeError(msg) 2025-12-04T12:33:30.9297711Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:33:30.9297713Z 2025-12-04T12:33:30.9297798Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9298001Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9298003Z 2025-12-04T12:33:30.9298100Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9298104Z 2025-12-04T12:33:30.9298106Z 2025-12-04T12:33:30.9298226Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9302864Z Process 0 terminated with exit code 10, terminating remaining processes. 
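The c10d_logger warning seen throughout this run ("barrier(): using the device under current context") says it can be muted by passing `device_id` to `init_process_group`. A minimal sketch of that init, assuming the usual env:// rendezvous with MASTER_ADDR/MASTER_PORT already set by the harness (the `init` name is illustrative):

import torch
import torch.distributed as dist

def init(rank, world_size):
    # Passing device_id here is what the barrier() warning suggests; it binds
    # the default process group to this rank's GPU up front.
    dist.init_process_group(
        backend="nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),
    )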
2025-12-04T12:33:30.9303134Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-809e0096656d718a.xml - 2025-12-04T12:33:30.9303200Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9303427Z FAILED [9.0118s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9303475Z Traceback (most recent call last): 2025-12-04T12:33:30.9303650Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9303752Z getattr(self, test_name)() 2025-12-04T12:33:30.9303915Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9303956Z fn() 2025-12-04T12:33:30.9304110Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9304156Z method(*args, **kwargs) 2025-12-04T12:33:30.9304309Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9304356Z method(*args, **kwargs) 2025-12-04T12:33:30.9304506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9304547Z with policy(): 2025-12-04T12:33:30.9304702Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9304753Z raise RuntimeError(msg) 2025-12-04T12:33:30.9305075Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16384 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:33:30.9305121Z 2025-12-04T12:33:30.9305204Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9305409Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9305412Z 2025-12-04T12:33:30.9305505Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9305574Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9305639Z ======================= 1 failed, 3 deselected in 9.02s ======================== 2025-12-04T12:33:30.9305681Z Got exit code 1 2025-12-04T12:33:30.9305723Z Retrying single test... 
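The "Process 0 exited with error code 10" / "Got exit code 1" lines come from the multiprocess harness joining its per-rank workers and surfacing any non-zero exit code. A loose analogue of that flow, not the actual implementation in common_distributed.py (the `run_two_ranks` and `target` names are illustrative):

import torch.multiprocessing as mp

def run_two_ranks(target):
    # Start one process per rank, join them, and raise if any rank exited
    # non-zero (exit code 10 marks the leak-check failure in this log).
    procs = [mp.Process(target=target, args=(rank,)) for rank in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")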
2025-12-04T12:33:30.9305932Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-03c4575fd3440fa8.xml 2025-12-04T12:33:30.9305994Z ============================= test session starts ============================== 2025-12-04T12:33:30.9306114Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9306156Z cachedir: .pytest_cache 2025-12-04T12:33:30.9306321Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9306372Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9306420Z configfile: pytest.ini 2025-12-04T12:33:30.9306586Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9306666Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9306864Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda 2025-12-04T12:33:30.9306916Z Running 1 items in this shard 2025-12-04T12:33:30.9306920Z 2025-12-04T12:33:30.9307203Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda I1204 12:32:43.950000 460795 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 460864 2025-12-04T12:33:30.9307365Z I1204 12:32:43.951000 460795 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 460865 2025-12-04T12:33:30.9307894Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9307961Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9308501Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9308564Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9308863Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T12:33:30.9308911Z return func(*args, **kwargs) 2025-12-04T12:33:30.9309198Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9309283Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9309559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/wrap.py:91: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9309607Z return fsdp_fn(module, **kwargs) 2025-12-04T12:33:30.9309876Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9309919Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9310185Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_fine_tune.py:298: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T12:33:30.9310228Z fsdp_seq = FSDP( 2025-12-04T12:33:30.9311525Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:33:30.9311658Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9312955Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:33:30.9313084Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:33:30.9313233Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9313402Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9313698Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9313861Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9314162Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9314307Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9314590Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9314743Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9315026Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9315177Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9315461Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9315603Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9315886Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9316041Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9316495Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 
2025-12-04T12:33:30.9316617Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9316817Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9317171Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9317292Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9317508Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9317678Z [rank1]:E1204 12:32:50.876000 460865 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9317718Z dist init r=1, world=2 2025-12-04T12:33:30.9317862Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9318026Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9318355Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9318538Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9318826Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9318955Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9319236Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9319388Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9319666Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9319820Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9320097Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9320241Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9320522Z 
[rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9320676Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9321123Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 2025-12-04T12:33:30.9321240Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9321467Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9321796Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9321915Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9322131Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9322297Z [rank0]:E1204 12:32:50.877000 460864 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9322340Z dist init r=0, world=2 2025-12-04T12:33:30.9322685Z [rank0]:[W1204 12:32:51.723339273 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9322752Z FAILED [8.7123s] [100%] 2025-12-04T12:33:30.9322754Z 2025-12-04T12:33:30.9322813Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9322908Z ________________ TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda ________________ 2025-12-04T12:33:30.9322956Z Traceback (most recent call last): 2025-12-04T12:33:30.9323123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9323169Z self._join_processes(fn) 2025-12-04T12:33:30.9323346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9323402Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9323584Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9323630Z raise RuntimeError(error) 2025-12-04T12:33:30.9323718Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9323764Z Traceback (most recent call last): 2025-12-04T12:33:30.9323929Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9323974Z getattr(self, test_name)() 2025-12-04T12:33:30.9324138Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9324177Z fn() 2025-12-04T12:33:30.9324330Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9324377Z method(*args, **kwargs) 2025-12-04T12:33:30.9324529Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9324573Z method(*args, **kwargs) 2025-12-04T12:33:30.9324727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9324769Z with policy(): 2025-12-04T12:33:30.9324924Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9324969Z raise RuntimeError(msg) 2025-12-04T12:33:30.9325289Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 
2025-12-04T12:33:30.9325291Z 2025-12-04T12:33:30.9325396Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9325601Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9325604Z 2025-12-04T12:33:30.9325696Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9325698Z 2025-12-04T12:33:30.9325761Z Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9325807Z Traceback (most recent call last): 2025-12-04T12:33:30.9325975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9326018Z getattr(self, test_name)() 2025-12-04T12:33:30.9326180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9326216Z fn() 2025-12-04T12:33:30.9326373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9326432Z method(*args, **kwargs) 2025-12-04T12:33:30.9326586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9326638Z method(*args, **kwargs) 2025-12-04T12:33:30.9326794Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9326832Z with policy(): 2025-12-04T12:33:30.9326989Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9327030Z raise RuntimeError(msg) 2025-12-04T12:33:30.9327355Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:33:30.9327357Z 2025-12-04T12:33:30.9327435Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9327638Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9327641Z 2025-12-04T12:33:30.9327733Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9327735Z 2025-12-04T12:33:30.9327737Z 2025-12-04T12:33:30.9327817Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9327909Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9328209Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-03c4575fd3440fa8.xml - 2025-12-04T12:33:30.9328277Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9328496Z FAILED [8.7123s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9328548Z Traceback (most recent call last): 2025-12-04T12:33:30.9328714Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9328760Z getattr(self, test_name)() 2025-12-04T12:33:30.9328920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9328958Z fn() 2025-12-04T12:33:30.9329110Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9329153Z method(*args, **kwargs) 2025-12-04T12:33:30.9329336Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9329383Z method(*args, **kwargs) 2025-12-04T12:33:30.9329536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9329577Z with policy(): 2025-12-04T12:33:30.9329733Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9329775Z raise RuntimeError(msg) 2025-12-04T12:33:30.9330094Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 17408 on device 0. CUDA driver allocated memory was 2019557376 and is now 3525312512. 
2025-12-04T12:33:30.9330096Z 2025-12-04T12:33:30.9330171Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9330375Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9330390Z 2025-12-04T12:33:30.9330479Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9330494Z 2025-12-04T12:33:30.9330556Z Process 1 exited with error code 10 and exception: 2025-12-04T12:33:30.9330602Z Traceback (most recent call last): 2025-12-04T12:33:30.9330770Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9330813Z getattr(self, test_name)() 2025-12-04T12:33:30.9330976Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9331011Z fn() 2025-12-04T12:33:30.9331168Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9331209Z method(*args, **kwargs) 2025-12-04T12:33:30.9331364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9331408Z method(*args, **kwargs) 2025-12-04T12:33:30.9331560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9331599Z with policy(): 2025-12-04T12:33:30.9331752Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9331796Z raise RuntimeError(msg) 2025-12-04T12:33:30.9332112Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 1. CUDA driver allocated memory was 1864368128 and is now 3370123264. 2025-12-04T12:33:30.9332114Z 2025-12-04T12:33:30.9332191Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9332390Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_ddp_cuda 2025-12-04T12:33:30.9332393Z 2025-12-04T12:33:30.9332485Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9332550Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:33:30.9332617Z ======================= 1 failed, 3 deselected in 8.72s ======================== 2025-12-04T12:33:30.9332655Z Got exit code 1 2025-12-04T12:33:30.9332810Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda 2025-12-04T12:33:30.9332940Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:33:30.9333174Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-5b5895a03e1f67ac.xml 2025-12-04T12:33:30.9333240Z ============================= test session starts ============================== 2025-12-04T12:33:30.9333354Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9333400Z cachedir: .pytest_cache 2025-12-04T12:33:30.9333561Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9333611Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9333653Z configfile: pytest.ini 2025-12-04T12:33:30.9333822Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9333896Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9333955Z stepcurrent: skipping 3 already run items. 2025-12-04T12:33:30.9334004Z Running 1 items in this shard 2025-12-04T12:33:30.9334006Z 2025-12-04T12:33:30.9334308Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda I1204 12:32:54.981000 461031 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 461100 2025-12-04T12:33:30.9334485Z I1204 12:32:54.982000 461031 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 461101 2025-12-04T12:33:30.9334986Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9335052Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9335543Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9335609Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9335902Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:30.9335948Z return func(*args, **kwargs) 2025-12-04T12:33:30.9336095Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9336263Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9336557Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9336718Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9337007Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9337135Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9337437Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9337588Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9337870Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9338019Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9338337Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9338478Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9338776Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9338942Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9339409Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 
2025-12-04T12:33:30.9339532Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9339729Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9340081Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9340200Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9340413Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9340584Z [rank1]:E1204 12:33:02.815000 461101 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9340624Z dist init r=1, world=2 2025-12-04T12:33:30.9340766Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9340928Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9341222Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9341378Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9341698Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9341829Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9342108Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9342260Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9342539Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9342689Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9342968Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9343129Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:33:30.9343414Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9343564Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9344029Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9344147Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9344347Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9344696Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9344813Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9345030Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9345197Z [rank0]:E1204 12:33:02.821000 461100 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9345241Z dist init r=0, world=2 2025-12-04T12:33:30.9345580Z [rank0]:[W1204 12:33:03.666204771 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9345623Z FAILED [9.5118s] [100%] 2025-12-04T12:33:30.9345625Z 2025-12-04T12:33:30.9345682Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9345781Z __________ TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda __________ 2025-12-04T12:33:30.9345848Z Traceback (most recent call last): 2025-12-04T12:33:30.9346015Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9346060Z self._join_processes(fn) 2025-12-04T12:33:30.9346237Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9346293Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9346476Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9346523Z raise RuntimeError(error) 2025-12-04T12:33:30.9346606Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9346654Z Traceback (most recent call last): 2025-12-04T12:33:30.9346817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9346864Z getattr(self, test_name)() 2025-12-04T12:33:30.9347023Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9347072Z fn() 2025-12-04T12:33:30.9347236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9347278Z method(*args, **kwargs) 2025-12-04T12:33:30.9347429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9347472Z method(*args, **kwargs) 2025-12-04T12:33:30.9347623Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9347666Z with policy(): 2025-12-04T12:33:30.9347818Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9347861Z raise RuntimeError(msg) 2025-12-04T12:33:30.9348226Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9348230Z 2025-12-04T12:33:30.9348306Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9348522Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9348525Z 2025-12-04T12:33:30.9348615Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9348617Z 2025-12-04T12:33:30.9348618Z 2025-12-04T12:33:30.9348696Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9348786Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9349035Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-5b5895a03e1f67ac.xml - 2025-12-04T12:33:30.9349097Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9349331Z FAILED [9.5118s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9349377Z Traceback (most recent call last): 2025-12-04T12:33:30.9349543Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9349585Z getattr(self, test_name)() 2025-12-04T12:33:30.9349773Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9349807Z fn() 2025-12-04T12:33:30.9349959Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9350001Z method(*args, **kwargs) 2025-12-04T12:33:30.9350155Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9350195Z method(*args, **kwargs) 2025-12-04T12:33:30.9350348Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9350384Z with policy(): 2025-12-04T12:33:30.9350537Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9350577Z raise RuntimeError(msg) 2025-12-04T12:33:30.9350912Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9350928Z 2025-12-04T12:33:30.9351004Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9351238Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9351240Z 2025-12-04T12:33:30.9351328Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9351391Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9351457Z ======================= 1 failed, 3 deselected in 9.52s ======================== 2025-12-04T12:33:30.9351493Z Got exit code 1 2025-12-04T12:33:30.9351534Z Retrying single test... 
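Every run above also opens with the same UserWarning from torch/distributed/fsdp/_init_utils.py: the test passes `device_id` as "cuda" without an index, so FSDP falls back to the current device on each rank. The warning itself names the fix, either call torch.cuda.set_device() before constructing FSDP or pass an indexed device as `device_id`. A hedged sketch of that remediation follows; wrap_model and the rank-to-device mapping are illustrative names, not code from this test, and the default process group is assumed to be initialized already. (The FutureWarning about the deprecated `NO_SHARD` strategy is a separate notice; it simply recommends DistributedDataParallel for that case.)

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(model: torch.nn.Module) -> FSDP:
    # Hypothetical helper (assumes the default process group is already
    # initialized): pin this rank to a concrete device before building FSDP,
    # as the UserWarning above recommends.
    rank = dist.get_rank()
    device = torch.device("cuda", rank % torch.cuda.device_count())
    torch.cuda.set_device(device)                    # option 1 from the warning
    return FSDP(model.to(device), device_id=device)  # option 2: indexed device_id
```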
2025-12-04T12:33:30.9351740Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-8465ced8e9a91520.xml 2025-12-04T12:33:30.9351801Z ============================= test session starts ============================== 2025-12-04T12:33:30.9351913Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9351957Z cachedir: .pytest_cache 2025-12-04T12:33:30.9352115Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9352162Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9352202Z configfile: pytest.ini 2025-12-04T12:33:30.9352366Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9352438Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9352652Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9352700Z Running 1 items in this shard 2025-12-04T12:33:30.9352702Z 2025-12-04T12:33:30.9352995Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda I1204 12:33:06.768000 461267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 461336 2025-12-04T12:33:30.9353151Z I1204 12:33:06.769000 461267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 461337 2025-12-04T12:33:30.9353667Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9353730Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9354219Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9354281Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9354578Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:30.9354625Z return func(*args, **kwargs) 2025-12-04T12:33:30.9354780Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9354947Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9355256Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9355425Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9355718Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9355849Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9356134Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9356293Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9356572Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9356726Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9357007Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9357153Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9357435Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9357591Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9358081Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 
2025-12-04T12:33:30.9358228Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9358428Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9358775Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9358895Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9359110Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9359278Z [rank0]:E1204 12:33:14.522000 461336 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9359338Z dist init r=0, world=2 2025-12-04T12:33:30.9359480Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9359661Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9359952Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9360115Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9360405Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9360536Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9360816Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9360967Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9361248Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9361399Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9361680Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9361820Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:33:30.9362102Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9362252Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9362739Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 2025-12-04T12:33:30.9362859Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9363056Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9363402Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9363519Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9363733Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9363922Z [rank1]:E1204 12:33:14.527000 461337 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9363962Z dist init r=1, world=2 2025-12-04T12:33:30.9364303Z [rank0]:[W1204 12:33:14.375356835 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9364343Z FAILED [9.4107s] [100%] 2025-12-04T12:33:30.9364345Z 2025-12-04T12:33:30.9364405Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9364504Z __________ TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda __________ 2025-12-04T12:33:30.9364554Z Traceback (most recent call last): 2025-12-04T12:33:30.9364718Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9364766Z self._join_processes(fn) 2025-12-04T12:33:30.9364940Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9364997Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9365176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9365223Z raise RuntimeError(error) 2025-12-04T12:33:30.9365304Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9365353Z Traceback (most recent call last): 2025-12-04T12:33:30.9365517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9365562Z getattr(self, test_name)() 2025-12-04T12:33:30.9365721Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9365759Z fn() 2025-12-04T12:33:30.9365911Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9365955Z method(*args, **kwargs) 2025-12-04T12:33:30.9366109Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9366151Z method(*args, **kwargs) 2025-12-04T12:33:30.9366305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9366342Z with policy(): 2025-12-04T12:33:30.9366524Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9366567Z raise RuntimeError(msg) 2025-12-04T12:33:30.9366902Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9366905Z 2025-12-04T12:33:30.9366981Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9367200Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9367203Z 2025-12-04T12:33:30.9367292Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9367294Z 2025-12-04T12:33:30.9367296Z 2025-12-04T12:33:30.9367376Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9367480Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:33:30.9367727Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-8465ced8e9a91520.xml - 2025-12-04T12:33:30.9367801Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9368036Z FAILED [9.4107s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9368085Z Traceback (most recent call last): 2025-12-04T12:33:30.9368272Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9368317Z getattr(self, test_name)() 2025-12-04T12:33:30.9368483Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9368524Z fn() 2025-12-04T12:33:30.9368677Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9368721Z method(*args, **kwargs) 2025-12-04T12:33:30.9368873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9368916Z method(*args, **kwargs) 2025-12-04T12:33:30.9369067Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9369110Z with policy(): 2025-12-04T12:33:30.9369264Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9369309Z raise RuntimeError(msg) 2025-12-04T12:33:30.9369644Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9369651Z 2025-12-04T12:33:30.9369726Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9369947Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9369949Z 2025-12-04T12:33:30.9370038Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9370106Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:33:30.9370168Z ======================= 1 failed, 3 deselected in 9.42s ======================== 2025-12-04T12:33:30.9370209Z Got exit code 1 2025-12-04T12:33:30.9370288Z Retrying single test... 
2025-12-04T12:33:30.9370495Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-33077fddbd7467fc.xml 2025-12-04T12:33:30.9370555Z ============================= test session starts ============================== 2025-12-04T12:33:30.9370672Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9370715Z cachedir: .pytest_cache 2025-12-04T12:33:30.9370877Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9370924Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9370966Z configfile: pytest.ini 2025-12-04T12:33:30.9371133Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9371210Z collecting ... collected 4 items / 3 deselected / 1 selected 2025-12-04T12:33:30.9371424Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9371484Z Running 1 items in this shard 2025-12-04T12:33:30.9371507Z 2025-12-04T12:33:30.9371802Z distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda I1204 12:33:18.512000 461503 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 461572 2025-12-04T12:33:30.9371960Z I1204 12:33:18.513000 461503 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 461573 2025-12-04T12:33:30.9372463Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9372526Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9373019Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:33:30.9373083Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:33:30.9373376Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T12:33:30.9373424Z return func(*args, **kwargs) 2025-12-04T12:33:30.9373568Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9373734Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9374027Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9374186Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9374473Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9374624Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9374906Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9375057Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9375336Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9375484Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9375766Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9375914Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:33:30.9376210Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9376363Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9376826Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 1. CUDA driver allocated memory was 1864368128 and is now 3340763136. 
2025-12-04T12:33:30.9376950Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9377148Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9377497Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9377615Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9377831Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9378001Z [rank1]:E1204 12:33:26.224000 461573 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:33:30.9378042Z dist init r=1, world=2 2025-12-04T12:33:30.9378221Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:33:30.9378383Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:33:30.9378674Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9378854Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:33:30.9379146Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9379275Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:33:30.9379557Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9379709Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9379987Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9380138Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:33:30.9380428Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9380587Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:33:30.9380868Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9381021Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:33:30.9381484Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9381602Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9381802Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9382150Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9382267Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:33:30.9382480Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9382649Z [rank0]:E1204 12:33:26.230000 461572 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:33:30.9382690Z dist init r=0, world=2 2025-12-04T12:33:30.9383026Z [rank0]:[W1204 12:33:26.082744436 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:33:30.9383068Z FAILED [9.4126s] [100%] 2025-12-04T12:33:30.9383070Z 2025-12-04T12:33:30.9383147Z =================================== FAILURES =================================== 2025-12-04T12:33:30.9383245Z __________ TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda __________ 2025-12-04T12:33:30.9383293Z Traceback (most recent call last): 2025-12-04T12:33:30.9383461Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:33:30.9383505Z self._join_processes(fn) 2025-12-04T12:33:30.9383682Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:33:30.9383737Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:33:30.9383921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:33:30.9383964Z raise RuntimeError(error) 2025-12-04T12:33:30.9384050Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9384095Z Traceback (most recent call last): 2025-12-04T12:33:30.9384260Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9384316Z getattr(self, test_name)() 2025-12-04T12:33:30.9384495Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9384530Z fn() 2025-12-04T12:33:30.9384686Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9384729Z method(*args, **kwargs) 2025-12-04T12:33:30.9384885Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9384929Z method(*args, **kwargs) 2025-12-04T12:33:30.9385083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9385124Z with policy(): 2025-12-04T12:33:30.9385278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9385323Z raise RuntimeError(msg) 2025-12-04T12:33:30.9385655Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9385658Z 2025-12-04T12:33:30.9385736Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9385954Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9385956Z 2025-12-04T12:33:30.9386052Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9386054Z 2025-12-04T12:33:30.9386056Z 2025-12-04T12:33:30.9386131Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:33:30.9386225Z Process 0 terminated with exit code 10, terminating remaining processes. 
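Two warnings recur in this log: barrier() falling back to "the device under current context", and destroy_process_group() never being called before program exit. Both concern process-group setup and teardown. Below is a minimal, hedged sketch of a script that handles both explicitly; the LOCAL_RANK wiring and backend choice are illustrative, and the device_id argument to init_process_group depends on the PyTorch version (the barrier warning above indicates it is available in this build).

    import os
    import torch
    import torch.distributed as dist

    def main():
        local_rank = int(os.environ.get("LOCAL_RANK", "0"))
        torch.cuda.set_device(local_rank)
        # Passing device_id addresses the barrier() "using the device under
        # current context" warning seen earlier in this log.
        dist.init_process_group("nccl", device_id=torch.device("cuda", local_rank))
        try:
            dist.barrier()
            # ... test or training body ...
        finally:
            # Explicit teardown avoids the ProcessGroupNCCL warning that
            # destroy_process_group() was not called before program exit.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()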
2025-12-04T12:33:30.9386476Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-33077fddbd7467fc.xml - 2025-12-04T12:33:30.9386539Z =========================== short test summary info ============================ 2025-12-04T12:33:30.9386775Z FAILED [9.4126s] distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:33:30.9386823Z Traceback (most recent call last): 2025-12-04T12:33:30.9387014Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:33:30.9387058Z getattr(self, test_name)() 2025-12-04T12:33:30.9387224Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:33:30.9387260Z fn() 2025-12-04T12:33:30.9387415Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9387455Z method(*args, **kwargs) 2025-12-04T12:33:30.9387610Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:33:30.9387651Z method(*args, **kwargs) 2025-12-04T12:33:30.9387804Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:33:30.9387842Z with policy(): 2025-12-04T12:33:30.9387999Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:33:30.9388042Z raise RuntimeError(msg) 2025-12-04T12:33:30.9388435Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda! Caching allocator allocated memory was 512 and is now reported as 35328 on device 0. CUDA driver allocated memory was 2019557376 and is now 3495952384. 2025-12-04T12:33:30.9388452Z 2025-12-04T12:33:30.9388530Z To execute this test, run the following from the base repo dir: 2025-12-04T12:33:30.9388747Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_fine_tune.py TestFSDPFineTuneCUDA.test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9388749Z 2025-12-04T12:33:30.9388843Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:33:30.9388909Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
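The earlier UserWarning from _init_utils.py says FSDP received `device_id` as the bare "cuda" device without an explicit index and fell back to the current device; its suggested fixes are to call torch.cuda.set_device() before FSDP initialization or to pass an explicit device index. A minimal sketch of both options, assuming a process group is already initialized; the placeholder module and rank plumbing are not the test's real setup.

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_for_rank(rank: int) -> FSDP:
        # Option 1 from the warning: make the current device explicit first.
        torch.cuda.set_device(rank)
        model = nn.Linear(8, 8)  # placeholder module, not the test's real model
        # Option 2: pass a device with an explicit index instead of bare "cuda".
        return FSDP(model, device_id=torch.device("cuda", rank))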
2025-12-04T12:33:30.9388977Z ======================= 1 failed, 3 deselected in 9.42s ======================== 2025-12-04T12:33:30.9389016Z Got exit code 1 2025-12-04T12:33:30.9389188Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda 2025-12-04T12:33:30.9389317Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:33:30.9389523Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-e112a4a560e98aab.xml 2025-12-04T12:33:30.9389582Z ============================= test session starts ============================== 2025-12-04T12:33:30.9389697Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:33:30.9389740Z cachedir: .pytest_cache 2025-12-04T12:33:30.9389905Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:33:30.9389954Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:33:30.9390000Z configfile: pytest.ini 2025-12-04T12:33:30.9390164Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:33:30.9390242Z collecting ... collected 4 items / 4 deselected / 0 selected 2025-12-04T12:33:30.9390300Z stepcurrent: skipping 4 already run items. 2025-12-04T12:33:30.9390346Z Running 0 items in this shard 2025-12-04T12:33:30.9390348Z 2025-12-04T12:33:30.9390599Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_fine_tune/distributed.fsdp.test_fsdp_fine_tune-e112a4a560e98aab.xml - 2025-12-04T12:33:30.9390660Z ============================ 4 deselected in 0.00s ============================= 2025-12-04T12:33:30.9391297Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_backward_reshard_hooks_cuda', 'test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_hooks_multi_traversal_cuda', 'test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_ddp_cuda', 'test/distributed/fsdp/test_fsdp_fine_tune.py::TestFSDPFineTuneCUDA::test_parity_with_non_frozen_fsdp_cuda'] 2025-12-04T12:33:30.9391301Z 2025-12-04T12:33:30.9391498Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_fine_tune 1/1 (test/test-reports/distributed.fsdp.test_fsdp_fine_tune_1.1_f2107156872849a9_.log) 2025-12-04T12:33:30.9391505Z 2025-12-04T12:33:30.9391636Z Finished distributed/fsdp/test_fsdp_fine_tune 1/1 ... [2025-12-04 12:33:30.889189][2291109.538370674], took 2.37min 2025-12-04T12:33:30.9391904Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:33:30.9391994Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:33:30.9392096Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:33:30.9392160Z Uploading artifacts took 0.00 seconds 2025-12-04T12:33:30.9392224Z distributed/fsdp/test_fsdp_fine_tune 1/1 failed! 2025-12-04T12:33:30.9392363Z Running distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 ... 
[2025-12-04 12:33:30.892410][2291109.541594542] 2025-12-04T12:33:30.9392418Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:33:30.9392757Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_dtensor_state_dict.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:33:30.892589] 2025-12-04T12:42:03.7282893Z 2025-12-04T12:42:03.7283756Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 (test/test-reports/distributed.fsdp.test_fsdp_dtensor_state_dict_1.1_429921b2f227c24a_.log) 2025-12-04T12:42:03.7284754Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-129d46d21b0c8aeb.xml 2025-12-04T12:42:03.7285385Z ============================= test session starts ============================== 2025-12-04T12:42:03.7285809Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7286177Z cachedir: .pytest_cache 2025-12-04T12:42:03.7286617Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7287138Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7287394Z configfile: pytest.ini 2025-12-04T12:42:03.7287868Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7289045Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7289931Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7290750Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7291576Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7291898Z collected 15 items 2025-12-04T12:42:03.7292176Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:42:03.7299207Z Running 15 items in this shard: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda, 
test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda, test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.7304691Z 2025-12-04T12:42:03.7305174Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:33:32.608000 461807 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 461876 2025-12-04T12:42:03.7305887Z I1204 12:33:32.608000 461807 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 461877 2025-12-04T12:42:03.7306293Z I1204 12:33:32.609000 461807 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 461878 2025-12-04T12:42:03.7306695Z I1204 12:33:32.610000 461807 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 461879 2025-12-04T12:42:03.7307796Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7308754Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7309713Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7310594Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7311461Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7312331Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7313088Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7313872Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7315646Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7317100Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7318624Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7320048Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7321531Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7322949Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7324383Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7325857Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7326227Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7326602Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7327138Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7327608Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7328075Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7328550Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7328996Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7329519Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7329977Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7330483Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7330994Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7331560Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7332060Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7332515Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7333259Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 952107008 and is now 2843738112. 
2025-12-04T12:42:03.7333990Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7334398Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7335090Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7335722Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7336140Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7336610Z E1204 12:33:40.104000 461879 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7336994Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7337334Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7337815Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7338386Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7355482Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7356069Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7356510Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7356970Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7357475Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7357932Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7358462Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7358911Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7359364Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7359826Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7360579Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7361336Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7361682Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7362377Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7362989Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7363350Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7363755Z E1204 12:33:40.123000 461876 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7364096Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7364431Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7364913Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7365382Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7365845Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7366277Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7366702Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7367189Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7367640Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7368087Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7368577Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7369013Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7369454Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7369905Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7370652Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7371359Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7371695Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7372375Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7372975Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7373321Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7373716Z E1204 12:33:40.133000 461877 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7374039Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7374361Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7374831Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7375295Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7375756Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7376185Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7376639Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7377087Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7377535Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7377979Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7378481Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7378918Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7379359Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7379840Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7380569Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7381261Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7381596Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7382279Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7382879Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7383227Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7383629Z E1204 12:33:40.177000 461878 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7383864Z FAILED [8.8170s] [ 6%] 2025-12-04T12:42:03.7383935Z 2025-12-04T12:42:03.7383993Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7384279Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7384548Z Traceback (most recent call last): 2025-12-04T12:42:03.7384799Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7385045Z self._join_processes(fn) 2025-12-04T12:42:03.7385295Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7385559Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7385867Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7386130Z raise RuntimeError(error) 2025-12-04T12:42:03.7386283Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7386445Z Traceback (most recent call last): 2025-12-04T12:42:03.7386687Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7386932Z getattr(self, test_name)() 2025-12-04T12:42:03.7387163Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7387395Z fn() 2025-12-04T12:42:03.7387597Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7387830Z method(*args, **kwargs) 2025-12-04T12:42:03.7388054Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7388336Z method(*args, **kwargs) 2025-12-04T12:42:03.7388572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7388820Z with policy(): 2025-12-04T12:42:03.7389035Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7389267Z raise RuntimeError(msg) 2025-12-04T12:42:03.7389775Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7390247Z 2025-12-04T12:42:03.7390322Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7390777Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7391152Z 2025-12-04T12:42:03.7391244Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7391368Z 2025-12-04T12:42:03.7391429Z Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.7391572Z Traceback (most recent call last): 2025-12-04T12:42:03.7391815Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7392057Z getattr(self, test_name)() 2025-12-04T12:42:03.7392289Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7392521Z fn() 2025-12-04T12:42:03.7392726Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7392957Z method(*args, **kwargs) 2025-12-04T12:42:03.7393176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7393405Z method(*args, **kwargs) 2025-12-04T12:42:03.7393622Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7393845Z with policy(): 2025-12-04T12:42:03.7394057Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7394290Z raise RuntimeError(msg) 2025-12-04T12:42:03.7394836Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7395302Z 2025-12-04T12:42:03.7395379Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7395828Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7396205Z 2025-12-04T12:42:03.7396292Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7396418Z 2025-12-04T12:42:03.7396476Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7396619Z Traceback (most recent call last): 2025-12-04T12:42:03.7396862Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7397105Z getattr(self, test_name)() 2025-12-04T12:42:03.7397349Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7397597Z fn() 2025-12-04T12:42:03.7397797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7398028Z method(*args, **kwargs) 2025-12-04T12:42:03.7398286Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7398519Z method(*args, **kwargs) 2025-12-04T12:42:03.7398739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7398966Z with policy(): 2025-12-04T12:42:03.7399181Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7399411Z raise RuntimeError(msg) 2025-12-04T12:42:03.7399918Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 952107008 and is now 2843738112. 2025-12-04T12:42:03.7400386Z 2025-12-04T12:42:03.7400461Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7400909Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7401283Z 2025-12-04T12:42:03.7401371Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7401501Z 2025-12-04T12:42:03.7401503Z 2025-12-04T12:42:03.7401583Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7401791Z Process 0 terminated with exit code 10, terminating remaining processes. 
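Editor's note: the outer RuntimeError above ("Process 0 exited with error code 10") is raised by the multi-process test harness rather than by the test body: the parent spawns one worker per GPU, joins them, and re-raises if any worker exited non-zero (the _join_processes / _check_return_codes frames in the traceback). The sketch below only illustrates that control flow under stated assumptions; it is not the actual implementation in common_distributed.py, and names like _worker and run_in_subprocesses are placeholders.

import multiprocessing as mp

def _worker(rank: int) -> None:
    # Per-rank test body; in the real harness a caught failure makes the
    # worker exit with code 10, which the parent then reports as seen above.
    pass

def run_in_subprocesses(world_size: int = 4) -> None:
    procs = [mp.Process(target=_worker, args=(rank,)) for rank in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run_in_subprocesses()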
2025-12-04T12:42:03.7402191Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-129d46d21b0c8aeb.xml - 2025-12-04T12:42:03.7402562Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7403011Z FAILED [8.8170s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7403440Z Traceback (most recent call last): 2025-12-04T12:42:03.7403720Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7403967Z getattr(self, test_name)() 2025-12-04T12:42:03.7404200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7404436Z fn() 2025-12-04T12:42:03.7404637Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7404870Z method(*args, **kwargs) 2025-12-04T12:42:03.7405090Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7405319Z method(*args, **kwargs) 2025-12-04T12:42:03.7405538Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7405765Z with policy(): 2025-12-04T12:42:03.7405980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7406234Z raise RuntimeError(msg) 2025-12-04T12:42:03.7406737Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T12:42:03.7407219Z 2025-12-04T12:42:03.7407296Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7407745Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7408121Z 2025-12-04T12:42:03.7408248Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7408374Z 2025-12-04T12:42:03.7408434Z Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.7408574Z Traceback (most recent call last): 2025-12-04T12:42:03.7408858Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7409106Z getattr(self, test_name)() 2025-12-04T12:42:03.7409338Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7409573Z fn() 2025-12-04T12:42:03.7409774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7410008Z method(*args, **kwargs) 2025-12-04T12:42:03.7410226Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7410457Z method(*args, **kwargs) 2025-12-04T12:42:03.7410678Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7410907Z with policy(): 2025-12-04T12:42:03.7411118Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7411349Z raise RuntimeError(msg) 2025-12-04T12:42:03.7411849Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7412316Z 2025-12-04T12:42:03.7412391Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7412894Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7413270Z 2025-12-04T12:42:03.7413367Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7413498Z 2025-12-04T12:42:03.7413559Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7413710Z Traceback (most recent call last): 2025-12-04T12:42:03.7413965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7414218Z getattr(self, test_name)() 2025-12-04T12:42:03.7414453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7414698Z fn() 2025-12-04T12:42:03.7414913Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7415167Z method(*args, **kwargs) 2025-12-04T12:42:03.7415392Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7415639Z method(*args, **kwargs) 2025-12-04T12:42:03.7415857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7416085Z with policy(): 2025-12-04T12:42:03.7416302Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7416543Z raise RuntimeError(msg) 2025-12-04T12:42:03.7417061Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 952107008 and is now 2843738112. 2025-12-04T12:42:03.7417535Z 2025-12-04T12:42:03.7417612Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7418069Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7418485Z 2025-12-04T12:42:03.7418576Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7418773Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7418942Z ============================== 1 failed in 8.96s =============================== 2025-12-04T12:42:03.7419084Z Got exit code 1 2025-12-04T12:42:03.7419195Z Retrying single test... 
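Editor's note: each per-device RuntimeError above quotes two counters taken before and after the test, the caching allocator's allocated bytes and the driver-level allocation on that device. A rough way to take the same kind of measurement locally with public torch.cuda APIs is sketched below; this is an approximation for debugging, not the harness's internal leak checker, and device 0 is only an example index.

import torch

def snapshot(device: int = 0):
    torch.cuda.synchronize(device)
    allocator_bytes = torch.cuda.memory_allocated(device)      # "Caching allocator allocated memory"
    free_bytes, total_bytes = torch.cuda.mem_get_info(device)
    driver_bytes = total_bytes - free_bytes                    # driver-level usage on the device
    return allocator_bytes, driver_bytes

if torch.cuda.is_available():
    before = snapshot()
    # ... run the suspected test body here ...
    torch.cuda.empty_cache()
    after = snapshot()
    if after[0] > before[0]:
        print(f"caching allocator grew from {before[0]} to {after[0]} bytes -> possible leak")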
2025-12-04T12:42:03.7419502Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a85052cd503004cf.xml 2025-12-04T12:42:03.7419943Z ============================= test session starts ============================== 2025-12-04T12:42:03.7420168Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7420369Z cachedir: .pytest_cache 2025-12-04T12:42:03.7420604Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7420853Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7420982Z configfile: pytest.ini 2025-12-04T12:42:03.7421223Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7421826Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7422277Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7422717Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7423174Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7423334Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7423767Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7424184Z Running 1 items in this shard 2025-12-04T12:42:03.7424263Z 2025-12-04T12:42:03.7424685Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:33:43.922000 462209 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 462278 2025-12-04T12:42:03.7425325Z I1204 12:33:43.922000 462209 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 462279 2025-12-04T12:42:03.7425681Z I1204 12:33:43.923000 462209 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 462280 2025-12-04T12:42:03.7426034Z I1204 12:33:43.924000 462209 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 462281 2025-12-04T12:42:03.7426920Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7427682Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7428468Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7429242Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7429994Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7430739Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7431506Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7432247Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7433609Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7435057Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7436490Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7437931Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7439417Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7440841Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7442309Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7443743Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7444039Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7444371Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7444860Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7445350Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7445841Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7446276Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7446709Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7447174Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7447628Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7448078Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7448563Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7449013Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7449461Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7449920Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7450657Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
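Editor's note: the FutureWarning emitted at the start of this run flags FSDP.state_dict_type() / FSDP.set_state_dict_type() as deprecated in favor of the parallelism-agnostic get_state_dict() / set_state_dict() APIs in torch.distributed.checkpoint.state_dict (see the API-doc URL quoted in the warning). The call shape is sketched below; `model` and `optim` are placeholders for the wrapped module and its optimizer, and the exact option fields should be checked against the linked documentation.

import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def roundtrip_state_dict(model: torch.nn.Module, optim: torch.optim.Optimizer) -> None:
    # Sharded (non-full) state dicts, kept on device; flip the flags to get a
    # full, CPU-offloaded state dict instead.
    opts = StateDictOptions(full_state_dict=False, cpu_offload=False)
    model_sd, optim_sd = get_state_dict(model, optim, options=opts)
    # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=opts,
    )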
2025-12-04T12:42:03.7451362Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7451742Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7452444Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7453055Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7453416Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7453832Z E1204 12:33:51.302000 462278 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7454169Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7454499Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7454991Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7455494Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7455969Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7456410Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7456844Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7457297Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7457749Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7458245Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7458705Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7459152Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7459634Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7460097Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7460839Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7461576Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7461931Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7462627Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7463238Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7463598Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7464012Z E1204 12:33:51.329000 462279 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7464363Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7464694Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7465192Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7465662Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7466135Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7466580Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7467016Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7467475Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7467939Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7468435Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7468904Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7469341Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7469784Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7470236Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7471006Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2843738112. 2025-12-04T12:42:03.7471710Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7472055Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7472750Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7473363Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7473715Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7474136Z E1204 12:33:51.354000 462281 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7474491Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7474826Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7475309Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7475783Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7476256Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7476701Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7477135Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7477591Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7478053Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7478550Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7479012Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7479457Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7479905Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7480365Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7481135Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7481844Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7482188Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7482882Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7483497Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7483874Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7484300Z E1204 12:33:51.362000 462280 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7484550Z FAILED [8.5152s] [100%] 2025-12-04T12:42:03.7484619Z 2025-12-04T12:42:03.7484686Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7484977Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7485250Z Traceback (most recent call last): 2025-12-04T12:42:03.7485501Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7485746Z self._join_processes(fn) 2025-12-04T12:42:03.7486004Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7486277Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7486555Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7486826Z raise RuntimeError(error) 2025-12-04T12:42:03.7486988Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7487162Z Traceback (most recent call last): 2025-12-04T12:42:03.7487411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7487664Z getattr(self, test_name)() 2025-12-04T12:42:03.7487910Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7488184Z fn() 2025-12-04T12:42:03.7488399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7488639Z method(*args, **kwargs) 2025-12-04T12:42:03.7488874Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7489116Z method(*args, **kwargs) 2025-12-04T12:42:03.7489350Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7489588Z with policy(): 2025-12-04T12:42:03.7489811Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7490055Z raise RuntimeError(msg) 2025-12-04T12:42:03.7490594Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7491063Z 2025-12-04T12:42:03.7491138Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7491588Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7491965Z 2025-12-04T12:42:03.7492054Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7492181Z 2025-12-04T12:42:03.7492183Z 2025-12-04T12:42:03.7492264Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7492468Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.7492886Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a85052cd503004cf.xml - 2025-12-04T12:42:03.7493267Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7493717Z FAILED [8.5152s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7494143Z Traceback (most recent call last): 2025-12-04T12:42:03.7494390Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7494634Z getattr(self, test_name)() 2025-12-04T12:42:03.7494867Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7495101Z fn() 2025-12-04T12:42:03.7495305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7495534Z method(*args, **kwargs) 2025-12-04T12:42:03.7495756Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7495984Z method(*args, **kwargs) 2025-12-04T12:42:03.7496209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7496439Z with policy(): 2025-12-04T12:42:03.7496658Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7496897Z raise RuntimeError(msg) 2025-12-04T12:42:03.7497409Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7497936Z 2025-12-04T12:42:03.7498014Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7498505Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7498884Z 2025-12-04T12:42:03.7498972Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7499195Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7499365Z ======================= 1 failed, 14 deselected in 8.65s ======================= 2025-12-04T12:42:03.7499508Z Got exit code 1 2025-12-04T12:42:03.7499607Z Retrying single test... 2025-12-04T12:42:03.7499904Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-fd866558b38d3026.xml 2025-12-04T12:42:03.7500229Z ============================= test session starts ============================== 2025-12-04T12:42:03.7500444Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7500636Z cachedir: .pytest_cache 2025-12-04T12:42:03.7500860Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7501102Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7501228Z configfile: pytest.ini 2025-12-04T12:42:03.7501455Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7502721Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7503179Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7503608Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7504044Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7504191Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7504614Z stepcurrent: skipping 0 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7505024Z Running 1 items in this shard 2025-12-04T12:42:03.7505099Z 2025-12-04T12:42:03.7505512Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:33:55.026000 462611 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 462680 2025-12-04T12:42:03.7506113Z I1204 12:33:55.027000 462611 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 462681 2025-12-04T12:42:03.7506456Z I1204 12:33:55.028000 462611 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 462682 2025-12-04T12:42:03.7506800Z I1204 12:33:55.028000 462611 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 462683 2025-12-04T12:42:03.7507672Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7508454Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7509226Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7510007Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7510752Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7511493Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7512232Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7513004Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7514343Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7515880Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7517317Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7518776Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7520231Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7521651Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7523071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7524514Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7524811Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7525139Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7525615Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7526086Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7526555Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7526988Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7527412Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7527865Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7528358Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7528807Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7529292Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7529729Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
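Editor's note: the PytestCollectionWarning printed during collection above ("cannot collect test class 'TestDummyModel' because it has a __init__ constructor") comes from pytest attempting to collect helper nn.Module classes whose names start with "Test". One common way to silence it is to mark such helpers as non-tests; the sketch below is illustrative only, and its class body is a placeholder rather than the real TestDummyModel from test_fsdp_dtensor_state_dict.py.

import torch

class TestDummyModel(torch.nn.Module):
    __test__ = False  # tell pytest this is a fixture model, not a test class

    def __init__(self) -> None:
        super().__init__()
        self.net = torch.nn.Linear(8, 8)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)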
2025-12-04T12:42:03.7530171Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7530620Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7531365Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7532063Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7532414Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7533118Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7533720Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7534072Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7534470Z E1204 12:34:02.430000 462680 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7534798Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7535122Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7535592Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7536054Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7536521Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7536954Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7537381Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7537831Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7538323Z E1204 
12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7538799Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7539247Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7539682Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7540122Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7540570Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7541308Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7542028Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7542363Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7543049Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7543654Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7544003Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7544403Z E1204 12:34:02.446000 462681 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7544728Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7545049Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7545522Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7545984Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7546451Z E1204 12:34:02.457000 462682 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7546886Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7547310Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7547755Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7548276Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7548726Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7549173Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7549607Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7550045Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7550497Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7551233Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
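For orientation, the RuntimeError above comes from the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK harness, which snapshots per-device memory counters before the test body and fails when they have grown afterwards (here from 0 to 2560 caching-allocator bytes on device 2). A rough hand-rolled sketch of the same comparison, assuming torch.cuda.memory_allocated and torch.cuda.mem_get_info as stand-ins for the counters reported in the message (the real checker lives in common_utils.py and may use different internals):

    import torch

    def cuda_mem_snapshot(device: int = 0) -> tuple[int, int]:
        # Bytes currently held by the CUDA caching allocator on `device`.
        allocator_bytes = torch.cuda.memory_allocated(device)
        # Driver-level usage approximated as total minus free device memory.
        free_bytes, total_bytes = torch.cuda.mem_get_info(device)
        return allocator_bytes, total_bytes - free_bytes

    before = cuda_mem_snapshot(0)
    # ... run the suspect test body here ...
    after = cuda_mem_snapshot(0)
    print(f"caching allocator: {before[0]} -> {after[0]} bytes, "
          f"driver: {before[1]} -> {after[1]} bytes")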
2025-12-04T12:42:03.7551952Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7552288Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7552974Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7553654Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7554005Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7554403Z E1204 12:34:02.457000 462682 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7554729Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7555051Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7555523Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7555986Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7556449Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7556882Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7557355Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7557834Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7558319Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7558766Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7559213Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7559651Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7560093Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7560597Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7561346Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1260388352 and is now 2843738112. 2025-12-04T12:42:03.7562043Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7562383Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7563066Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7563671Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7564019Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7564417Z E1204 12:34:02.489000 462683 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7564652Z FAILED [8.5150s] [100%] 2025-12-04T12:42:03.7564716Z 2025-12-04T12:42:03.7564777Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7565061Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7565339Z Traceback (most recent call last): 2025-12-04T12:42:03.7565590Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7565836Z self._join_processes(fn) 2025-12-04T12:42:03.7566084Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7566352Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7566626Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7566891Z raise RuntimeError(error) 2025-12-04T12:42:03.7567088Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7567263Z Traceback (most recent call last): 2025-12-04T12:42:03.7567512Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7567765Z getattr(self, test_name)() 2025-12-04T12:42:03.7568007Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:42:03.7568275Z fn()
2025-12-04T12:42:03.7568491Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.7568724Z method(*args, **kwargs)
2025-12-04T12:42:03.7568948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.7569182Z method(*args, **kwargs)
2025-12-04T12:42:03.7569408Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:42:03.7569635Z with policy():
2025-12-04T12:42:03.7569870Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:42:03.7570123Z raise RuntimeError(msg)
2025-12-04T12:42:03.7570628Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T12:42:03.7571095Z 
2025-12-04T12:42:03.7571173Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.7571628Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T12:42:03.7572013Z 
2025-12-04T12:42:03.7572105Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.7572236Z 
2025-12-04T12:42:03.7572239Z 
2025-12-04T12:42:03.7572320Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T12:42:03.7572529Z Process 0 terminated with exit code 10, terminating remaining processes.
2025-12-04T12:42:03.7572932Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-fd866558b38d3026.xml -
2025-12-04T12:42:03.7573309Z =========================== short test summary info ============================
2025-12-04T12:42:03.7573765Z FAILED [8.5150s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception:
2025-12-04T12:42:03.7574202Z Traceback (most recent call last):
2025-12-04T12:42:03.7574454Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T12:42:03.7574708Z getattr(self, test_name)()
2025-12-04T12:42:03.7574947Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T12:42:03.7575186Z fn()
2025-12-04T12:42:03.7575397Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.7575635Z method(*args, **kwargs)
2025-12-04T12:42:03.7575861Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.7576096Z method(*args, **kwargs)
2025-12-04T12:42:03.7576350Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:42:03.7576586Z with policy():
2025-12-04T12:42:03.7576804Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:42:03.7577042Z raise RuntimeError(msg)
2025-12-04T12:42:03.7577554Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208.
2025-12-04T12:42:03.7578026Z 
2025-12-04T12:42:03.7578108Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.7578608Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T12:42:03.7578996Z 
2025-12-04T12:42:03.7579093Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.7579300Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T12:42:03.7579470Z ======================= 1 failed, 14 deselected in 8.65s ======================= 2025-12-04T12:42:03.7579611Z Got exit code 1 2025-12-04T12:42:03.7579956Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7580411Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.7580808Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-2cd233f9856036a5.xml 2025-12-04T12:42:03.7581135Z ============================= test session starts ============================== 2025-12-04T12:42:03.7581346Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7581541Z cachedir: .pytest_cache 2025-12-04T12:42:03.7581767Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7582007Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7582129Z configfile: pytest.ini 2025-12-04T12:42:03.7582357Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7582918Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7583357Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7583792Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7584232Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7584381Z collected 15 items / 1 deselected / 14 selected 2025-12-04T12:42:03.7584524Z stepcurrent: skipping 1 already run items. 2025-12-04T12:42:03.7584652Z Running 14 items in this shard 2025-12-04T12:42:03.7584727Z 2025-12-04T12:42:03.7585168Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:34:06.091000 463013 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 463082 2025-12-04T12:42:03.7585767Z I1204 12:34:06.092000 463013 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 463083 2025-12-04T12:42:03.7586115Z I1204 12:34:06.092000 463013 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 463084 2025-12-04T12:42:03.7586458Z I1204 12:34:06.093000 463013 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 463085 2025-12-04T12:42:03.7587337Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7588105Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7588887Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7589653Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7590392Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7591137Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7591874Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7592614Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7593967Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7595392Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7596843Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. 
This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7598377Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7599901Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7601338Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7602772Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7604188Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7604484Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7604814Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7605290Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7605784Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7606252Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7606691Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7607117Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7607568Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7608019Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7608516Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7608982Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7609420Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7609862Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7610315Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7611084Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T12:42:03.7611779Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7612117Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7612808Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7613413Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7613765Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7614167Z E1204 12:34:13.531000 463082 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7614495Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7614818Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7615321Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7615788Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7616253Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7616685Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7617108Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7617556Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7618018Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7618518Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7618969Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7619404Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7619847Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7620300Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7621034Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1101004800 and is now 2843738112. 2025-12-04T12:42:03.7621725Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7622062Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7622745Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7623352Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7623705Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7624106Z E1204 12:34:13.543000 463084 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7624432Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7624783Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7625256Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7625722Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7626190Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7626624Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7627049Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7627513Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7627976Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7628462Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7628911Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7629352Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7629791Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7630242Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7630972Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7631667Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7632003Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7632688Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7633289Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7633649Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7634112Z E1204 12:34:13.543000 463083 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7634443Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7634768Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7635240Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7635704Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7636166Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7636599Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7637047Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7637511Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7637961Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7638442Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7638894Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7639336Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7639777Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7640228Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7640962Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7641656Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7641996Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7642678Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7643282Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7643665Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7644068Z E1204 12:34:13.594000 463085 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7644305Z FAILED [8.6128s] [ 7%] 2025-12-04T12:42:03.7644374Z 2025-12-04T12:42:03.7644434Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7644714Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.7644982Z Traceback (most recent call last): 2025-12-04T12:42:03.7645228Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7645477Z self._join_processes(fn) 2025-12-04T12:42:03.7645727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7646007Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7646277Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7664272Z raise RuntimeError(error) 2025-12-04T12:42:03.7664434Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7664598Z Traceback (most recent call last): 2025-12-04T12:42:03.7664848Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7665094Z getattr(self, test_name)() 2025-12-04T12:42:03.7665327Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7665559Z fn() 2025-12-04T12:42:03.7665766Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7666002Z method(*args, **kwargs) 2025-12-04T12:42:03.7666221Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7666452Z method(*args, **kwargs) 2025-12-04T12:42:03.7666669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7666897Z with policy(): 2025-12-04T12:42:03.7667109Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7667340Z raise RuntimeError(msg) 2025-12-04T12:42:03.7667856Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7668369Z 2025-12-04T12:42:03.7668449Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7668903Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7669278Z 2025-12-04T12:42:03.7669373Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7669499Z 2025-12-04T12:42:03.7669501Z 2025-12-04T12:42:03.7669583Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7669786Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.7670261Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-2cd233f9856036a5.xml - 2025-12-04T12:42:03.7670629Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7671082Z FAILED [8.6128s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7671508Z Traceback (most recent call last): 2025-12-04T12:42:03.7671759Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7672005Z getattr(self, test_name)() 2025-12-04T12:42:03.7672238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7672472Z fn() 2025-12-04T12:42:03.7672677Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7672921Z method(*args, **kwargs) 2025-12-04T12:42:03.7673140Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7673384Z method(*args, **kwargs) 2025-12-04T12:42:03.7673602Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7673823Z with policy(): 2025-12-04T12:42:03.7674033Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7674263Z raise RuntimeError(msg) 2025-12-04T12:42:03.7674769Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7675239Z 2025-12-04T12:42:03.7675316Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7675764Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7676138Z 2025-12-04T12:42:03.7676226Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7676410Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7676573Z ======================= 1 failed, 1 deselected in 8.75s ======================== 2025-12-04T12:42:03.7676709Z Got exit code 1 2025-12-04T12:42:03.7676807Z Retrying single test... 2025-12-04T12:42:03.7677096Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-57b0d531b0940846.xml 2025-12-04T12:42:03.7677415Z ============================= test session starts ============================== 2025-12-04T12:42:03.7677624Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7677813Z cachedir: .pytest_cache 2025-12-04T12:42:03.7678034Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7678314Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7678432Z configfile: pytest.ini 2025-12-04T12:42:03.7678663Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7679253Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7679692Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7680125Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7680563Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7680707Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7681124Z stepcurrent: skipping 1 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7681529Z Running 1 items in this shard 2025-12-04T12:42:03.7681603Z 2025-12-04T12:42:03.7682028Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:34:17.171000 463415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 463484 2025-12-04T12:42:03.7682643Z I1204 12:34:17.172000 463415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 463485 2025-12-04T12:42:03.7682985Z I1204 12:34:17.173000 463415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 463486 2025-12-04T12:42:03.7683323Z I1204 12:34:17.173000 463415 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 463487 2025-12-04T12:42:03.7684199Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7684955Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7685694Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7686438Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7687169Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7687909Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7688732Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7689473Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7690818Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7692263Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7693699Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7695110Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7696531Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7697945Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7699426Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7700831Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7701124Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7701448Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7701922Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7702423Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7702885Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7703314Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7703734Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7704179Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7704625Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7705076Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7705523Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7705958Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:42:03.7706396Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7706842Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7707583Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1260388352 and is now 2843738112. 2025-12-04T12:42:03.7708308Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7708674Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7709360Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7709957Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7710306Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7710704Z E1204 12:34:24.616000 463487 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7711030Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7711348Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7711828Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7712344Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7712803Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7713233Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7713661Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7714110Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7714554Z E1204 12:34:24.640000 
463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7714995Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7715439Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7715871Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7716305Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7716750Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7717476Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7718226Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7718560Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7719243Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7719837Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7720182Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7720581Z E1204 12:34:24.640000 463485 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7720922Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7721263Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7721732Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7722191Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7722652Z E1204 12:34:24.648000 463484 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7723079Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7723500Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7723944Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7724388Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7724829Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7725274Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7725706Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7726140Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7726584Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7727334Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
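Note on the failure mode: the RuntimeError repeated for every rank above is raised by the CUDA memory-leak check that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables for this mem_leak_check shard. The checker snapshots the caching-allocator counter and the driver-level allocation for each device before the test body and compares them afterwards; here every rank ends with 7680 bytes still held by the caching allocator. A minimal sketch of that before/after accounting follows; it is an approximation for illustration, not the actual implementation in common_utils.py, and the leak_check helper name is made up.

    import torch

    def leak_check(test_fn, device=0):
        # Snapshot caching-allocator and driver-level usage before the test body.
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)
        free_before, total = torch.cuda.mem_get_info(device)
        driver_before = total - free_before

        test_fn()

        # Re-measure after the test body; growth in the caching-allocator counter
        # that the driver-level number also confirms is reported as a leak.
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after
        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"Caching allocator allocated memory was {alloc_before} and is now "
                f"reported as {alloc_after} on device {device}. CUDA driver allocated "
                f"memory was {driver_before} and is now {driver_after}."
            )

Requiring both counters to grow mirrors the "CUDA driver API confirmed a leak" wording: growth in the caching allocator alone can be a false positive, so the driver-level figure serves as confirmation.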
2025-12-04T12:42:03.7728021Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7728382Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7729061Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7729669Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7730016Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7730428Z E1204 12:34:24.648000 463484 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7730766Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7731085Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7731552Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7732011Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7732533Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7732966Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7733384Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7733829Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7734276Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7734719Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7735164Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7735600Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7736035Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7736482Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7737240Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7737929Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7738297Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7739016Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7739632Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7739992Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7740387Z E1204 12:34:24.710000 463486 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7740626Z FAILED [8.6151s] [100%] 2025-12-04T12:42:03.7740691Z 2025-12-04T12:42:03.7740752Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7741033Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.7741300Z Traceback (most recent call last): 2025-12-04T12:42:03.7741546Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7741791Z self._join_processes(fn) 2025-12-04T12:42:03.7742037Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7742300Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7742569Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7742826Z raise RuntimeError(error) 2025-12-04T12:42:03.7742978Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7743138Z Traceback (most recent call last): 2025-12-04T12:42:03.7743377Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7743619Z getattr(self, test_name)() 2025-12-04T12:42:03.7743849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:42:03.7744080Z fn() 2025-12-04T12:42:03.7744282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7744511Z method(*args, **kwargs) 2025-12-04T12:42:03.7744731Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7744958Z method(*args, **kwargs) 2025-12-04T12:42:03.7745176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7745399Z with policy(): 2025-12-04T12:42:03.7745609Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7745867Z raise RuntimeError(msg) 2025-12-04T12:42:03.7746370Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1260388352 and is now 2843738112. 2025-12-04T12:42:03.7746838Z 2025-12-04T12:42:03.7746913Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7747362Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7747735Z 2025-12-04T12:42:03.7747825Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7747951Z 2025-12-04T12:42:03.7747956Z 2025-12-04T12:42:03.7748033Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7748277Z Process 3 terminated with exit code 10, terminating remaining processes. 
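The parent-process traceback above (_join_processes -> _check_return_codes) shows the harness pattern behind "Process 3 terminated with exit code 10, terminating remaining processes.": one worker process per rank runs the test, exits with a status code (10 accompanies the leak-check failure in this log), and the parent converts any non-zero exit into the RuntimeError that pytest reports. A simplified, self-contained sketch of that join-and-check pattern, with a stand-in worker instead of the real MultiProcessTestCase machinery:

    import multiprocessing as mp

    MEM_LEAK_EXIT_CODE = 10  # inferred from the "exit code 10" lines in this log

    def _worker(rank):
        # Stand-in for the per-rank test body; a leak-check failure makes the
        # real worker exit with a dedicated error code instead of 0.
        raise SystemExit(MEM_LEAK_EXIT_CODE if rank == 3 else 0)

    def run_multiprocess_test(world_size=4):
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=_worker, args=(rank,)) for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        run_multiprocess_test()

The real harness additionally forwards each child's traceback text to the parent, which is why the same leak message appears once per rank and then again inside the parent's RuntimeError.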
2025-12-04T12:42:03.7748671Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-57b0d531b0940846.xml - 2025-12-04T12:42:03.7749051Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7749497Z FAILED [8.6151s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7749922Z Traceback (most recent call last): 2025-12-04T12:42:03.7750170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7750412Z getattr(self, test_name)() 2025-12-04T12:42:03.7750646Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7750875Z fn() 2025-12-04T12:42:03.7751073Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7751300Z method(*args, **kwargs) 2025-12-04T12:42:03.7751517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7751743Z method(*args, **kwargs) 2025-12-04T12:42:03.7751958Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7752180Z with policy(): 2025-12-04T12:42:03.7752391Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7752619Z raise RuntimeError(msg) 2025-12-04T12:42:03.7753122Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1260388352 and is now 2843738112. 2025-12-04T12:42:03.7753586Z 2025-12-04T12:42:03.7753662Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7754110Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7754484Z 2025-12-04T12:42:03.7754599Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7754786Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7754949Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.7755088Z Got exit code 1 2025-12-04T12:42:03.7755184Z Retrying single test... 
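"Got exit code 1" followed by "Retrying single test..." is the shard runner re-invoking pytest on only the failing test id, which is why a fresh session header and the same collection output appear again below. Roughly, the retry loop looks like the sketch here; retry_single_test is a hypothetical name, and the real runner additionally manages stepcurrent bookkeeping and the per-attempt XML report paths seen in this log.

    import subprocess
    import sys

    TEST_ID = (
        "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::"
        "TestFSDPWithDeviceMeshAndDTensorCUDA::"
        "test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda"
    )

    def retry_single_test(max_attempts=3):
        # Re-run just the failing test id until it passes or attempts run out.
        for attempt in range(1, max_attempts + 1):
            proc = subprocess.run([sys.executable, "-m", "pytest", "-v", TEST_ID])
            print(f"Got exit code {proc.returncode}")
            if proc.returncode == 0:
                return True
            if attempt < max_attempts:
                print("Retrying single test...")
        return False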
2025-12-04T12:42:03.7755477Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ccab4a1f1ee4cc50.xml 2025-12-04T12:42:03.7755796Z ============================= test session starts ============================== 2025-12-04T12:42:03.7756006Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7756191Z cachedir: .pytest_cache 2025-12-04T12:42:03.7756412Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7756652Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7756773Z configfile: pytest.ini 2025-12-04T12:42:03.7757019Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7757584Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7758019Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7758488Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7758924Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7759071Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7759489Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7759895Z Running 1 items in this shard 2025-12-04T12:42:03.7759968Z 2025-12-04T12:42:03.7760377Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:34:28.239000 463817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 463886 2025-12-04T12:42:03.7760973Z I1204 12:34:28.240000 463817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 463887 2025-12-04T12:42:03.7761320Z I1204 12:34:28.241000 463817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 463888 2025-12-04T12:42:03.7761661Z I1204 12:34:28.242000 463817 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 463889 2025-12-04T12:42:03.7762529Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7763315Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7764084Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7764831Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7765565Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7766302Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7767038Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7767803Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7769182Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7770597Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7772022Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7773438Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7774885Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7776293Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7777707Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7779182Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7779477Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7779804Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7780275Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7780737Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7781201Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7781632Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7782057Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7782508Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7782956Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7783401Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7783876Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7784311Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7784746Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7785193Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7785925Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1105199104 and is now 2843738112. 
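Aside from the leak itself, the FutureWarning printed once per rank above flags the deprecated FSDP.set_state_dict_type call at test_fsdp_dtensor_state_dict.py:240 and points at get_state_dict()/set_state_dict() as the replacement. A hedged sketch of that migration, assuming an FSDP-wrapped model and a single optimizer (offload and full-vs-sharded behaviour is controlled via StateDictOptions in the same torch.distributed.checkpoint.state_dict module, not shown here):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
    from torch.distributed.fsdp import ShardedStateDictConfig, StateDictType
    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    def checkpoint_roundtrip(model: FSDP, optimizer: torch.optim.Optimizer):
        # Deprecated pattern that triggers the FutureWarning in this log:
        FSDP.set_state_dict_type(
            model,
            StateDictType.SHARDED_STATE_DICT,
            ShardedStateDictConfig(offload_to_cpu=False),
        )
        legacy_sd = model.state_dict()

        # Replacement named in the warning: works across FSDP1, FSDP2 and DDP.
        model_sd, optim_sd = get_state_dict(model, optimizer)
        set_state_dict(
            model,
            optimizer,
            model_state_dict=model_sd,
            optim_state_dict=optim_sd,
        )
        return legacy_sd, model_sd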
2025-12-04T12:42:03.7786633Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7786968Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7787665Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7788298Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7788648Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7789047Z E1204 12:34:35.702000 463888 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7789369Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7789689Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7790668Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7791126Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7791588Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7792014Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7792433Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7792878Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7793322Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7793801Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7794248Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7794680Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7795113Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7795557Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7796289Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.7797007Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7797339Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7798020Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7798663Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7799010Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7799405Z E1204 12:34:35.710000 463887 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7799728Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7800046Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7800512Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7800972Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7801433Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7801861Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7802281Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7802724Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7803201Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7803343Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7803611Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7803741Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7804010Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7804154Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7804716Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.7804838Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7805027Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7805487Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7805598Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7805801Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7805959Z E1204 12:34:35.751000 463886 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7806091Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7806242Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7806522Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7806668Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7806946Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7807060Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7807352Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7807513Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7807786Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7807928Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7808235Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7808365Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7808636Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7808792Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7810109Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.7810217Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7810408Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7810863Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7810971Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7811173Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7811332Z E1204 12:34:35.767000 463889 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7811372Z FAILED [8.6133s] [100%] 2025-12-04T12:42:03.7811376Z 2025-12-04T12:42:03.7811435Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7811620Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.7811668Z Traceback (most recent call last): 2025-12-04T12:42:03.7811833Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7811877Z self._join_processes(fn) 2025-12-04T12:42:03.7812051Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7812104Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7812284Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7812328Z raise RuntimeError(error) 2025-12-04T12:42:03.7812436Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.7812482Z Traceback (most recent call last): 2025-12-04T12:42:03.7812647Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7812691Z getattr(self, test_name)() 2025-12-04T12:42:03.7812853Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7812887Z fn() 2025-12-04T12:42:03.7813040Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7813082Z method(*args, **kwargs) 2025-12-04T12:42:03.7813273Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7813314Z method(*args, **kwargs) 2025-12-04T12:42:03.7813469Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7813518Z with policy(): 2025-12-04T12:42:03.7813672Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7813724Z raise RuntimeError(msg) 2025-12-04T12:42:03.7814156Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1105199104 and is now 2843738112. 2025-12-04T12:42:03.7814159Z 2025-12-04T12:42:03.7814238Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7814577Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7814580Z 2025-12-04T12:42:03.7814672Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7814675Z 2025-12-04T12:42:03.7814677Z 2025-12-04T12:42:03.7814753Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7814845Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.7815120Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ccab4a1f1ee4cc50.xml - 2025-12-04T12:42:03.7815182Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7815531Z FAILED [8.6133s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.7815579Z Traceback (most recent call last): 2025-12-04T12:42:03.7815746Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7815790Z getattr(self, test_name)() 2025-12-04T12:42:03.7815950Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7815986Z fn() 2025-12-04T12:42:03.7816138Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7816178Z method(*args, **kwargs) 2025-12-04T12:42:03.7816329Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7816396Z method(*args, **kwargs) 2025-12-04T12:42:03.7816548Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7816586Z with policy(): 2025-12-04T12:42:03.7816739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7816782Z raise RuntimeError(msg) 2025-12-04T12:42:03.7817214Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1105199104 and is now 2843738112. 2025-12-04T12:42:03.7817216Z 2025-12-04T12:42:03.7817294Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7817633Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7817645Z 2025-12-04T12:42:03.7817734Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7817807Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7817870Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.7817907Z Got exit code 1 2025-12-04T12:42:03.7818231Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7818360Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.7818591Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-822400ffddb1145d.xml 2025-12-04T12:42:03.7818651Z ============================= test session starts ============================== 2025-12-04T12:42:03.7818766Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7818810Z cachedir: .pytest_cache 2025-12-04T12:42:03.7818968Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7819017Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7819059Z configfile: pytest.ini 2025-12-04T12:42:03.7819225Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7819584Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7819638Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7819982Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7820043Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7820100Z collected 15 items / 2 deselected / 13 selected 2025-12-04T12:42:03.7820156Z stepcurrent: skipping 2 already run items. 
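Note on the RuntimeError above: the mem_leak_check harness records per-device memory before the wrapped test and compares it afterwards; the two pairs of numbers in the message are the caching-allocator bytes and the driver-reported bytes. A rough sketch of that comparison using only public torch.cuda calls (assumptions: a single visible device, and run_test_body is a hypothetical stand-in; this is not the actual check in common_utils.py, which is more careful and retries after collecting garbage and emptying the cache):

import gc
import torch

def snapshot(device=0):
    # Bytes currently held by the caching allocator on this device.
    allocator_bytes = torch.cuda.memory_allocated(device)
    # Bytes the driver reports as in use: total minus free.
    free, total = torch.cuda.mem_get_info(device)
    return allocator_bytes, total - free

before_alloc, before_driver = snapshot()
run_test_body()  # hypothetical placeholder for the test method under check
gc.collect()
torch.cuda.empty_cache()
after_alloc, after_driver = snapshot()

if after_alloc > before_alloc and after_driver > before_driver:
    raise RuntimeError(
        f"Caching allocator allocated memory was {before_alloc} and is now "
        f"reported as {after_alloc}. CUDA driver allocated memory was "
        f"{before_driver} and is now {after_driver}."
    )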
2025-12-04T12:42:03.7820199Z Running 13 items in this shard 2025-12-04T12:42:03.7820201Z 2025-12-04T12:42:03.7820635Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:34:39.393000 464219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 464288 2025-12-04T12:42:03.7820794Z I1204 12:34:39.394000 464219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 464289 2025-12-04T12:42:03.7820949Z I1204 12:34:39.395000 464219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 464290 2025-12-04T12:42:03.7821101Z I1204 12:34:39.395000 464219 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 464291 2025-12-04T12:42:03.7821784Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7821846Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7822516Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7822572Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7823239Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7823282Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7823952Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7823995Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7825266Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7825413Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7826677Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7826803Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7828080Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7828251Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7829514Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7829635Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7829769Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7829925Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7830207Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7830379Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7830659Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7830777Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7831045Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7831190Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7831463Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7831618Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7831906Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7832035Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7832307Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7832449Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] raise 
RuntimeError(msg) 2025-12-04T12:42:03.7833002Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7833114Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7833302Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7833767Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7833876Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7834080Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7834237Z E1204 12:34:46.781000 464291 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7834369Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7834522Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7834821Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7834969Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7835246Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7835362Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7835629Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7835772Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7836054Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7836205Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7836477Z E1204 
12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7836605Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7836876Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7837017Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7837567Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7837677Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7837867Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7838349Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7838458Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7838660Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7838818Z E1204 12:34:46.793000 464289 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7838980Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7839133Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7839414Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7839561Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7839836Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7839952Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7840220Z E1204 12:34:46.797000 464290 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7840373Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7840653Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7840794Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7841066Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7841197Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7841470Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7841612Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7842160Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.7842269Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7842462Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7842924Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7843032Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7843236Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7843412Z E1204 12:34:46.797000 464290 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7843545Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7843698Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7843975Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7844120Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7844398Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7844514Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7844791Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7844944Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7845213Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7845355Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7845625Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7845752Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7846024Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7846164Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7846713Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.7846822Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7847013Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7847468Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7847574Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7847795Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7847953Z E1204 12:34:46.801000 464288 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7847995Z FAILED [8.5138s] [ 7%] 2025-12-04T12:42:03.7847997Z 2025-12-04T12:42:03.7848054Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7848271Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7848318Z Traceback (most recent call last): 2025-12-04T12:42:03.7848485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7848530Z self._join_processes(fn) 2025-12-04T12:42:03.7848707Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7848781Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7848962Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7849020Z raise RuntimeError(error) 2025-12-04T12:42:03.7849102Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7849148Z Traceback (most recent call last): 2025-12-04T12:42:03.7849308Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7849353Z getattr(self, test_name)() 2025-12-04T12:42:03.7849510Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7849547Z 
fn() 2025-12-04T12:42:03.7849700Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7849744Z method(*args, **kwargs) 2025-12-04T12:42:03.7849894Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7849937Z method(*args, **kwargs) 2025-12-04T12:42:03.7850086Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7850126Z with policy(): 2025-12-04T12:42:03.7850277Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7850320Z raise RuntimeError(msg) 2025-12-04T12:42:03.7850755Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7850758Z 2025-12-04T12:42:03.7850836Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7851208Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7851212Z 2025-12-04T12:42:03.7851305Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7851308Z 2025-12-04T12:42:03.7851310Z 2025-12-04T12:42:03.7851387Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7851475Z Process 3 terminated with exit code 10, terminating remaining processes. 
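Note on the FutureWarning from test_fsdp_dtensor_state_dict.py:240 repeated in this output: it asks callers to move from FSDP.set_state_dict_type() to the torch.distributed.checkpoint.state_dict APIs it links. A minimal sketch of that migration, assuming an already-initialized process group and an FSDP-wrapped model; model and optimizer are placeholder arguments and the sharded/offload options are illustrative, not taken from the test:

from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def checkpoint_roundtrip(model, optimizer):
    # Sharded (non-full) state dict, optionally offloaded to CPU, replacing the
    # deprecated FSDP.set_state_dict_type(...) call flagged by the warning.
    options = StateDictOptions(full_state_dict=False, cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optimizer, options=options)

    # ... persist/restore model_sd and optim_sd (e.g. via torch.distributed.checkpoint) ...

    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )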
2025-12-04T12:42:03.7851777Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-822400ffddb1145d.xml - 2025-12-04T12:42:03.7851841Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7852189Z FAILED [8.5138s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7852236Z Traceback (most recent call last): 2025-12-04T12:42:03.7852402Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7852446Z getattr(self, test_name)() 2025-12-04T12:42:03.7852605Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7852644Z fn() 2025-12-04T12:42:03.7852795Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7852848Z method(*args, **kwargs) 2025-12-04T12:42:03.7853010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7853051Z method(*args, **kwargs) 2025-12-04T12:42:03.7853200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7853238Z with policy(): 2025-12-04T12:42:03.7853389Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7853431Z raise RuntimeError(msg) 2025-12-04T12:42:03.7853862Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7853865Z 2025-12-04T12:42:03.7853943Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7854280Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7854284Z 2025-12-04T12:42:03.7854372Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7854438Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7854501Z ======================= 1 failed, 2 deselected in 8.65s ======================== 2025-12-04T12:42:03.7854542Z Got exit code 1 2025-12-04T12:42:03.7854582Z Retrying single test... 
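Note on the repeated UserWarning from torch/autograd/graph.py:865 in this run: the warning text itself names an opt-out. A one-line sketch, only appropriate when the AccumulateGrad stream mismatch is intentional (otherwise the warning recommends dropping stale references to the autograd graph or initializing DDP under the same stream as subsequent forwards):

import torch

# Suppress the AccumulateGrad stream-mismatch warning, as suggested by the warning text.
torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)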
2025-12-04T12:42:03.7854811Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-62bd11c9460b4997.xml 2025-12-04T12:42:03.7854870Z ============================= test session starts ============================== 2025-12-04T12:42:03.7854984Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7855025Z cachedir: .pytest_cache 2025-12-04T12:42:03.7855189Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7855235Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7855275Z configfile: pytest.ini 2025-12-04T12:42:03.7855437Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7855815Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7855868Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7856213Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7856272Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7856329Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7856664Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7856708Z Running 1 items in this shard 2025-12-04T12:42:03.7856720Z 2025-12-04T12:42:03.7857127Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:34:50.303000 464621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 464690 2025-12-04T12:42:03.7857293Z I1204 12:34:50.304000 464621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 464691 2025-12-04T12:42:03.7857446Z I1204 12:34:50.305000 464621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 464692 2025-12-04T12:42:03.7857596Z I1204 12:34:50.305000 464621 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 464693 2025-12-04T12:42:03.7858312Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7858359Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7859030Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7859075Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7859745Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7859789Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7860485Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7860528Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7861803Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7861945Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7863225Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7863350Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7864653Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7864777Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7866063Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7866184Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7866319Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7866475Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7866760Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7866929Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7867208Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7867427Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7867698Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7867841Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7868111Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7868303Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7868574Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7868703Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7868976Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7869118Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7869671Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T12:42:03.7869781Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7870003Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7870465Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7870574Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7870779Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7870936Z E1204 12:34:57.752000 464690 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7871070Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7871236Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7871529Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7871675Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7871955Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7872074Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7872342Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7872486Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7872753Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7872895Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7873164Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7873293Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7873565Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7873706Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7874282Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7874390Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7874580Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7875039Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7875145Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7875351Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7875508Z E1204 12:34:57.761000 464692 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7875650Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7875812Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7876092Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7876238Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7876517Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7876634Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7876905Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7877046Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7877313Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7877455Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7877722Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7877853Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7878124Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7878309Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7878882Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440. 2025-12-04T12:42:03.7878991Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7879184Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7879642Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7879752Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7879955Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7880145Z E1204 12:34:57.780000 464693 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7880275Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7880428Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7880709Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7880856Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7881133Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7881249Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7881521Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7881661Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7881931Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7882071Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7882340Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7882469Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7882742Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7882916Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7883468Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.7883578Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7883771Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7884228Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7884350Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7884569Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7884728Z E1204 12:34:57.829000 464691 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7884773Z FAILED [8.6149s] [100%] 2025-12-04T12:42:03.7884776Z 2025-12-04T12:42:03.7884833Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7885021Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7885071Z Traceback (most recent call last): 2025-12-04T12:42:03.7885239Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7885285Z self._join_processes(fn) 2025-12-04T12:42:03.7885463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7885518Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7885698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7885742Z raise RuntimeError(error) 2025-12-04T12:42:03.7885825Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7885870Z Traceback (most recent call last): 2025-12-04T12:42:03.7886035Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7886080Z getattr(self, test_name)() 2025-12-04T12:42:03.7886241Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7886277Z fn() 2025-12-04T12:42:03.7886432Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7886477Z method(*args, **kwargs) 2025-12-04T12:42:03.7886628Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7886673Z method(*args, **kwargs) 2025-12-04T12:42:03.7886822Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7886862Z with policy(): 2025-12-04T12:42:03.7887035Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7887080Z raise RuntimeError(msg) 2025-12-04T12:42:03.7887513Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.7887516Z 2025-12-04T12:42:03.7887593Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7887931Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7887934Z 2025-12-04T12:42:03.7888027Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7888031Z 2025-12-04T12:42:03.7888092Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.7888188Z Traceback (most recent call last): 2025-12-04T12:42:03.7888353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7888412Z getattr(self, test_name)() 2025-12-04T12:42:03.7888574Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7888610Z fn() 2025-12-04T12:42:03.7888764Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7888804Z method(*args, **kwargs) 2025-12-04T12:42:03.7888963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7889002Z method(*args, **kwargs) 2025-12-04T12:42:03.7889157Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7889195Z with policy(): 2025-12-04T12:42:03.7889351Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7889393Z raise RuntimeError(msg) 2025-12-04T12:42:03.7889829Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.7889832Z 2025-12-04T12:42:03.7889907Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7890249Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7890252Z 2025-12-04T12:42:03.7890342Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7890345Z 2025-12-04T12:42:03.7890347Z 2025-12-04T12:42:03.7890423Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7890512Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.7890782Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-62bd11c9460b4997.xml - 2025-12-04T12:42:03.7890845Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7891218Z FAILED [8.6149s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.7891268Z Traceback (most recent call last): 2025-12-04T12:42:03.7891432Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7891477Z getattr(self, test_name)() 2025-12-04T12:42:03.7891638Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7891673Z fn() 2025-12-04T12:42:03.7891825Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7891865Z method(*args, **kwargs) 2025-12-04T12:42:03.7892016Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7892057Z method(*args, **kwargs) 2025-12-04T12:42:03.7892210Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7892259Z with policy(): 2025-12-04T12:42:03.7892423Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7892464Z raise RuntimeError(msg) 2025-12-04T12:42:03.7892895Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T12:42:03.7892897Z 2025-12-04T12:42:03.7892971Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7893310Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7893313Z 2025-12-04T12:42:03.7893400Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7893402Z 2025-12-04T12:42:03.7893461Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.7893508Z Traceback (most recent call last): 2025-12-04T12:42:03.7893671Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7893714Z getattr(self, test_name)() 2025-12-04T12:42:03.7893873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7893909Z fn() 2025-12-04T12:42:03.7894061Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7894104Z method(*args, **kwargs) 2025-12-04T12:42:03.7894254Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7894296Z method(*args, **kwargs) 2025-12-04T12:42:03.7894445Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7894484Z with policy(): 2025-12-04T12:42:03.7894637Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7894682Z raise RuntimeError(msg) 2025-12-04T12:42:03.7895131Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7895136Z 2025-12-04T12:42:03.7895210Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7895545Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7895548Z 2025-12-04T12:42:03.7895634Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7895700Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7895761Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.7895800Z Got exit code 1 2025-12-04T12:42:03.7895840Z Retrying single test... 
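Before the retry output below, note that the failure text above prints an exact repro command. A minimal sketch of driving it from Python, assuming the PyTorch repo root as the working directory; the environment variables and the test id are copied verbatim from the log, everything else is illustrative.

import os
import subprocess

env = dict(os.environ, PYTORCH_TEST_WITH_ROCM="1", PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")
subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py",
        "TestFSDPWithDeviceMeshAndDTensorCUDA."
        "test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda",
    ],
    env=env,
    check=False,  # the test is expected to fail while the leak is present
)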
2025-12-04T12:42:03.7896070Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f354e4e5939e2ac6.xml 2025-12-04T12:42:03.7896138Z ============================= test session starts ============================== 2025-12-04T12:42:03.7896270Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7896312Z cachedir: .pytest_cache 2025-12-04T12:42:03.7896472Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7896519Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7896562Z configfile: pytest.ini 2025-12-04T12:42:03.7896726Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7897087Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7897141Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7897484Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7897545Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7897603Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7897932Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7897976Z Running 1 items in this shard 2025-12-04T12:42:03.7897978Z 2025-12-04T12:42:03.7898426Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:35:01.468000 465023 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 465092 2025-12-04T12:42:03.7898585Z I1204 12:35:01.468000 465023 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 465093 2025-12-04T12:42:03.7898741Z I1204 12:35:01.469000 465023 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 465094 2025-12-04T12:42:03.7898893Z I1204 12:35:01.470000 465023 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 465095 2025-12-04T12:42:03.7899599Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7899647Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7900315Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7900359Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7901033Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7901100Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7901771Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7901812Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7903091Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7903222Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7904506Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7904631Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7905898Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7906039Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7907360Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T12:42:03.7907483Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7907619Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7907775Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7908060Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7908259Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7908539Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7908656Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7908925Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7909095Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7909369Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7909511Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7909782Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7909910Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7910183Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7910335Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7910904Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1105199104 and is now 2820669440. 
2025-12-04T12:42:03.7911016Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7911209Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7911668Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7911778Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7911983Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7912141Z E1204 12:35:11.491000 465095 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7912273Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7912427Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7912709Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7912858Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7913134Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7913250Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7913541Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7913686Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7913953Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7914095Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7914393Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7914524Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7914812Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7914963Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7915515Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7915626Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7915819Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7916278Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7916386Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7916591Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7916750Z E1204 12:35:11.500000 465094 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7916882Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7917034Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7917314Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7917464Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7917792Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7917910Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7918216Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7918360Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7918629Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7918771Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7919042Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7919183Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7919467Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7919608Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7920164Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.7920273Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7920466Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7920922Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7921029Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7921234Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7921392Z E1204 12:35:11.541000 465092 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7921525Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7921677Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7921959Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7922109Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7922413Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7922531Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7922799Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7922940Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7923207Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7923350Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7923629Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7923769Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7924041Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7924182Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7924735Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.7924844Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7925034Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7925532Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7925640Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7925845Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7926004Z E1204 12:35:11.545000 465093 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7926048Z FAILED [11.2178s] [100%] 2025-12-04T12:42:03.7926050Z 2025-12-04T12:42:03.7926108Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7926293Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.7926340Z Traceback (most recent call last): 2025-12-04T12:42:03.7926526Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7926572Z self._join_processes(fn) 2025-12-04T12:42:03.7926748Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7926803Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7926983Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7927028Z raise RuntimeError(error) 2025-12-04T12:42:03.7927111Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7927157Z Traceback (most recent call last): 2025-12-04T12:42:03.7927318Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7927361Z getattr(self, test_name)() 2025-12-04T12:42:03.7927522Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7927572Z fn() 2025-12-04T12:42:03.7927724Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7927777Z method(*args, **kwargs) 2025-12-04T12:42:03.7927928Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7927970Z method(*args, **kwargs) 2025-12-04T12:42:03.7928121Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7928196Z with policy(): 2025-12-04T12:42:03.7928348Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7928392Z raise RuntimeError(msg) 2025-12-04T12:42:03.7928828Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1105199104 and is now 2820669440. 2025-12-04T12:42:03.7928832Z 2025-12-04T12:42:03.7928910Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7929249Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7929253Z 2025-12-04T12:42:03.7929341Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7929343Z 2025-12-04T12:42:03.7929345Z 2025-12-04T12:42:03.7929424Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7929512Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.7929787Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f354e4e5939e2ac6.xml - 2025-12-04T12:42:03.7929848Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7930198Z FAILED [11.2178s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7930244Z Traceback (most recent call last): 2025-12-04T12:42:03.7930411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7930493Z getattr(self, test_name)() 2025-12-04T12:42:03.7930657Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7930694Z fn() 2025-12-04T12:42:03.7930846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7930888Z method(*args, **kwargs) 2025-12-04T12:42:03.7931039Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7931079Z method(*args, **kwargs) 2025-12-04T12:42:03.7931229Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7931269Z with policy(): 2025-12-04T12:42:03.7931421Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7931466Z raise RuntimeError(msg) 2025-12-04T12:42:03.7931898Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1105199104 and is now 2820669440. 2025-12-04T12:42:03.7931924Z 2025-12-04T12:42:03.7932001Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7932337Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7932342Z 2025-12-04T12:42:03.7932429Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7932496Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7932559Z ====================== 1 failed, 14 deselected in 11.35s ======================= 2025-12-04T12:42:03.7932599Z Got exit code 1 2025-12-04T12:42:03.7932882Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.7933013Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.7933242Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-6cfddfd95d5027e8.xml 2025-12-04T12:42:03.7933302Z ============================= test session starts ============================== 2025-12-04T12:42:03.7933415Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7933460Z cachedir: .pytest_cache 2025-12-04T12:42:03.7933618Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7933667Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7933710Z configfile: pytest.ini 2025-12-04T12:42:03.7933875Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7934234Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7934289Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7934657Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7934716Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7934773Z collected 15 items / 3 deselected / 12 selected 2025-12-04T12:42:03.7934828Z stepcurrent: skipping 3 already run items. 
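The FutureWarning repeated throughout these runs says FSDP.state_dict_type()/FSDP.set_state_dict_type() are being deprecated in favor of torch.distributed.checkpoint.state_dict. A minimal sketch of that replacement API follows; it assumes an already initialized process group, an FSDP-wrapped module, and its optimizer (none of which this fragment sets up), and the function name is illustrative rather than taken from the test file.

# Sketch of the API the FutureWarning points to; `model` and `optimizer` are assumed to be
# an FSDP-wrapped module and its optimizer inside an initialized distributed job.
import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def roundtrip_state_dict(model: torch.nn.Module, optimizer: torch.optim.Optimizer) -> None:
    # Gather sharded (DTensor-backed) state dicts, offloaded to CPU; cpu_offload mirrors the
    # offload_to_cpu_True knob the failing test parameterizes over.
    options = StateDictOptions(cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optimizer, options=options)

    # Load them back without calling FSDP.set_state_dict_type().
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )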
2025-12-04T12:42:03.7934872Z Running 12 items in this shard 2025-12-04T12:42:03.7934874Z 2025-12-04T12:42:03.7935282Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:35:15.336000 465425 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 465494 2025-12-04T12:42:03.7935440Z I1204 12:35:15.337000 465425 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 465495 2025-12-04T12:42:03.7935593Z I1204 12:35:15.337000 465425 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 465496 2025-12-04T12:42:03.7935745Z I1204 12:35:15.338000 465425 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 465497 2025-12-04T12:42:03.7936439Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7936494Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7937165Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7937209Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7937880Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7937923Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7938626Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7938671Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7939970Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7940097Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7941364Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7941516Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7942787Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7942909Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7944173Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7944318Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7944452Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7944611Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7944896Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7945045Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7945327Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7945445Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7945728Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7945881Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7946152Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7946292Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7946564Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7946697Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7946968Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7947110Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] raise 
RuntimeError(msg) 2025-12-04T12:42:03.7947667Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1107296256 and is now 2820669440. 2025-12-04T12:42:03.7947780Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7947971Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7948463Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7955030Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7955298Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7955461Z E1204 12:35:22.800000 465497 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7955594Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7955747Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7956030Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7956178Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7956465Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7956597Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7956883Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7957023Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7957291Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7957433Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7957702Z E1204 
12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7957830Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7958103Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7958295Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7958852Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7958965Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7959156Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7959642Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7959753Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7959957Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7960120Z E1204 12:35:22.851000 465495 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7960251Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7960405Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7960688Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7960837Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7961127Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7961256Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7961526Z E1204 12:35:22.852000 465494 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7961665Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7961936Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7962076Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7962345Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7962472Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7962744Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7962886Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7963437Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
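[editor's note] Every "CUDA driver API confirmed a leak" failure above comes from the same per-test check: the harness snapshots per-device allocator usage before the test body and re-reads it afterwards (here device 0 went from 0 to 7680 bytes in the caching allocator). Below is a loose approximation of that pattern for orientation only; it is not PyTorch's actual CudaMemoryLeakCheck implementation and simplifies away the driver-level accounting:

# Illustrative approximation only -- a stripped-down stand-in for the check the
# harness wraps around each test when PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 is set.
import gc
from contextlib import contextmanager

import torch


@contextmanager
def cuda_leak_check():
    # snapshot per-device usage of the caching allocator before the test body
    gc.collect()
    for d in range(torch.cuda.device_count()):
        torch.cuda.synchronize(d)
    before = [torch.cuda.memory_allocated(d) for d in range(torch.cuda.device_count())]

    yield  # test body runs here; the check below only runs if it succeeds

    # re-read after the body; any growth is reported the way the log does above
    gc.collect()
    for d in range(torch.cuda.device_count()):
        torch.cuda.synchronize(d)
    for d, old in enumerate(before):
        now = torch.cuda.memory_allocated(d)
        if now > old:
            raise RuntimeError(
                f"Caching allocator allocated memory was {old} and is now "
                f"reported as {now} on device {d}"
            )
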
2025-12-04T12:42:03.7963545Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7963733Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7964209Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7964319Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7964523Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7964682Z E1204 12:35:22.852000 465494 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.7964851Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7965009Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7965289Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7965457Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7965733Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7965848Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7966119Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7966259Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7966530Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7966670Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7966940Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7967067Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7967339Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7967479Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7968027Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7968135Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7968396Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7968853Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7968960Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7969164Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7969322Z E1204 12:35:22.864000 465496 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.7969363Z FAILED [8.6153s] [ 8%] 2025-12-04T12:42:03.7969368Z 2025-12-04T12:42:03.7969427Z =================================== FAILURES =================================== 2025-12-04T12:42:03.7969628Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.7969690Z Traceback (most recent call last): 2025-12-04T12:42:03.7969855Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.7969901Z self._join_processes(fn) 2025-12-04T12:42:03.7970074Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.7970129Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.7970308Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.7970355Z raise RuntimeError(error) 2025-12-04T12:42:03.7970436Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7970484Z Traceback (most recent call last): 2025-12-04T12:42:03.7970645Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7970689Z getattr(self, test_name)() 2025-12-04T12:42:03.7970846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7970881Z 
fn() 2025-12-04T12:42:03.7971032Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7971074Z method(*args, **kwargs) 2025-12-04T12:42:03.7971224Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7971264Z method(*args, **kwargs) 2025-12-04T12:42:03.7971415Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7971455Z with policy(): 2025-12-04T12:42:03.7971607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7971648Z raise RuntimeError(msg) 2025-12-04T12:42:03.7972079Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1107296256 and is now 2820669440. 2025-12-04T12:42:03.7972082Z 2025-12-04T12:42:03.7972157Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7972520Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7972524Z 2025-12-04T12:42:03.7972614Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7972617Z 2025-12-04T12:42:03.7972620Z 2025-12-04T12:42:03.7972698Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.7972787Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:42:03.7973061Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-6cfddfd95d5027e8.xml - 2025-12-04T12:42:03.7973122Z =========================== short test summary info ============================ 2025-12-04T12:42:03.7973471Z FAILED [8.6153s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.7973529Z Traceback (most recent call last): 2025-12-04T12:42:03.7973704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7973747Z getattr(self, test_name)() 2025-12-04T12:42:03.7973907Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7973943Z fn() 2025-12-04T12:42:03.7974095Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7974134Z method(*args, **kwargs) 2025-12-04T12:42:03.7974285Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7974327Z method(*args, **kwargs) 2025-12-04T12:42:03.7974478Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7974516Z with policy(): 2025-12-04T12:42:03.7974669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7974710Z raise RuntimeError(msg) 2025-12-04T12:42:03.7975141Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1107296256 and is now 2820669440. 2025-12-04T12:42:03.7975143Z 2025-12-04T12:42:03.7975217Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7975558Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7975561Z 2025-12-04T12:42:03.7975649Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7975714Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.7975777Z ======================= 1 failed, 3 deselected in 8.75s ======================== 2025-12-04T12:42:03.7975815Z Got exit code 1 2025-12-04T12:42:03.7975855Z Retrying single test... 
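[editor's note] The "To execute this test, run the following from the base repo dir" block above is a complete local repro. The wrapper below is a convenience sketch only: the command and environment variables are copied verbatim from that message, while the wrapper itself and the placeholder checkout path are editorial additions:

# Convenience wrapper only -- the command and env vars are taken from the repro
# message printed above; everything else here is illustrative.
import os
import subprocess

env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",            # run the test against the ROCm/HIP build
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",   # enable the per-test allocator leak check
    # PYTORCH_PRINT_REPRO_ON_FAILURE="0",   # per the log, silences the repro banner
)

subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py",
        "TestFSDPWithDeviceMeshAndDTensorCUDA."
        "test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda",
    ],
    cwd="/path/to/pytorch",  # placeholder for the "base repo dir" the message refers to
    env=env,
    check=True,
)
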
2025-12-04T12:42:03.7976084Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f7d3e22584c96aa7.xml 2025-12-04T12:42:03.7976141Z ============================= test session starts ============================== 2025-12-04T12:42:03.7976280Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.7976321Z cachedir: .pytest_cache 2025-12-04T12:42:03.7976481Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.7976529Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.7976569Z configfile: pytest.ini 2025-12-04T12:42:03.7976734Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.7977095Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7977148Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.7977493Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.7977563Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.7977631Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.7977960Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7978004Z Running 1 items in this shard 2025-12-04T12:42:03.7978006Z 2025-12-04T12:42:03.7978522Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:35:26.592000 465827 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 465896 2025-12-04T12:42:03.7978678Z I1204 12:35:26.593000 465827 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 465897 2025-12-04T12:42:03.7978830Z I1204 12:35:26.593000 465827 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 465898 2025-12-04T12:42:03.7978982Z I1204 12:35:26.594000 465827 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 465899 2025-12-04T12:42:03.7979672Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.7979718Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7980393Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7980437Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7981136Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7981179Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7981852Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.7981895Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.7983172Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7983326Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7984598Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. 
This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7984725Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7986016Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.7986139Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7987402Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
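[editor's note] The UserWarning ending just above both explains the cause (an AccumulateGrad node kept alive across iterations, e.g. by a retained loss or a DDP-held reference) and names an explicit opt-out. A minimal sketch of that opt-out, with the call quoted from the warning itself; it is only appropriate when the stream mismatch is known to be intentional:

# Quoted from the UserWarning above. The preferred fix is to drop stale
# references to the autograd graph (e.g. the loss) or to run DDP initialization
# on the same stream as subsequent forwards; this call merely silences the warning.
import torch

torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)
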
2025-12-04T12:42:03.7987533Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.7987677Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7987833Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7988117Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7988300Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7988581Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7988700Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7988970Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7989111Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7989383Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7989523Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7989793Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7989923Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7990194Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7990333Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7990918Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
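[editor's note] The FutureWarning printed at the start of this retry session (FSDP.state_dict_type() / FSDP.set_state_dict_type() being deprecated) points at torch.distributed.checkpoint.state_dict as the replacement. A hedged sketch of the API the warning names, with model and optimizer standing in for whatever the caller already has wrapped in FSDP:

# Illustrative sketch only -- not the test's code. It shows the API the
# FutureWarning above recommends in place of FSDP.set_state_dict_type().
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict


def checkpoint_roundtrip(model, optimizer):
    # get_state_dict() returns (model_state_dict, optim_state_dict) and, per the
    # warning's doc link, is meant to work across FSDP1, FSDP2 and DDP wrappers.
    model_sd, optim_sd = get_state_dict(model, optimizer)

    # ... save / reload model_sd and optim_sd, e.g. via torch.distributed.checkpoint ...

    # set_state_dict() is the matching load-side call.
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
    )
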
2025-12-04T12:42:03.7991031Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7991221Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7991679Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7991799Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7992005Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7992177Z E1204 12:35:34.019000 465899 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.7992308Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7992462Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7992743Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7992892Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7993169Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7993286Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7993554Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7993696Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7993965Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7994105Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7994374Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7994501Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7994795Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7994935Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.7995484Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.7995594Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7995783Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.7996241Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.7996366Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.7996568Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.7996726Z E1204 12:35:34.023000 465897 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.7996858Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.7997013Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.7997293Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.7997440Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.7997715Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.7997830Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.7998099Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7998291Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7998561Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.7998700Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.7998968Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.7999124Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.7999397Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.7999540Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8000087Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8000195Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8000384Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8000861Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8000980Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8001182Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8001340Z E1204 12:35:34.049000 465896 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8001470Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8001623Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8001904Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8002052Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8002329Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8002445Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8002713Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8002855Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8003124Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8003264Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8003571Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8003699Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8003971Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8004110Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8004660Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8004779Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8004968Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8005435Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8005543Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8005749Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8005905Z E1204 12:35:34.057000 465898 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8005947Z FAILED [8.7142s] [100%] 2025-12-04T12:42:03.8005950Z 2025-12-04T12:42:03.8006006Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8006188Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8006235Z Traceback (most recent call last): 2025-12-04T12:42:03.8006398Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8006443Z self._join_processes(fn) 2025-12-04T12:42:03.8006616Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8006671Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8006849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8006894Z raise RuntimeError(error) 2025-12-04T12:42:03.8006975Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8007021Z Traceback (most recent call last): 2025-12-04T12:42:03.8007181Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8007223Z getattr(self, test_name)() 2025-12-04T12:42:03.8007381Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8007416Z fn() 2025-12-04T12:42:03.8007587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8007628Z method(*args, **kwargs) 2025-12-04T12:42:03.8007780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8007819Z method(*args, **kwargs) 2025-12-04T12:42:03.8007969Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8008006Z with policy(): 2025-12-04T12:42:03.8008190Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8008233Z raise RuntimeError(msg) 2025-12-04T12:42:03.8008667Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8008671Z 2025-12-04T12:42:03.8008762Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8009098Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8009113Z 2025-12-04T12:42:03.8009202Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8009204Z 2025-12-04T12:42:03.8009206Z 2025-12-04T12:42:03.8009281Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8009369Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8009644Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f7d3e22584c96aa7.xml - 2025-12-04T12:42:03.8009705Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8010053Z FAILED [8.7142s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8010101Z Traceback (most recent call last): 2025-12-04T12:42:03.8010265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8010308Z getattr(self, test_name)() 2025-12-04T12:42:03.8010468Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8010502Z fn() 2025-12-04T12:42:03.8010655Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8010696Z method(*args, **kwargs) 2025-12-04T12:42:03.8010848Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8010888Z method(*args, **kwargs) 2025-12-04T12:42:03.8011037Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8011074Z with policy(): 2025-12-04T12:42:03.8011225Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8011265Z raise RuntimeError(msg) 2025-12-04T12:42:03.8011721Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8011725Z 2025-12-04T12:42:03.8011800Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8012137Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8012140Z 2025-12-04T12:42:03.8012226Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8012289Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8012350Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8012388Z Got exit code 1 2025-12-04T12:42:03.8012427Z Retrying single test... 2025-12-04T12:42:03.8012658Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-b0ce32d3573b3332.xml 2025-12-04T12:42:03.8012728Z ============================= test session starts ============================== 2025-12-04T12:42:03.8012852Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8012893Z cachedir: .pytest_cache 2025-12-04T12:42:03.8013053Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8013098Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8013138Z configfile: pytest.ini 2025-12-04T12:42:03.8013302Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8013663Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8013716Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8014059Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8014119Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8014176Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8014503Z stepcurrent: skipping 3 already run items. 
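The two PytestCollectionWarning lines above are benign: pytest attempts to collect any class whose name starts with Test, and skips these nn.Module helpers because they define __init__. If the noise is unwanted, marking such helper classes as non-tests silences it. A minimal sketch, with an illustrative body rather than the real TestDummyModel from test_fsdp_dtensor_state_dict.py:

    import torch

    class TestDummyModel(torch.nn.Module):
        __test__ = False  # tell pytest not to collect this nn.Module as a test class

        def __init__(self) -> None:
            super().__init__()
            self.net = torch.nn.Linear(8, 8)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.net(x)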
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8014546Z Running 1 items in this shard 2025-12-04T12:42:03.8014550Z 2025-12-04T12:42:03.8014960Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:35:37.708000 466229 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 466298 2025-12-04T12:42:03.8015118Z I1204 12:35:37.709000 466229 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 466299 2025-12-04T12:42:03.8015270Z I1204 12:35:37.710000 466229 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 466300 2025-12-04T12:42:03.8015458Z I1204 12:35:37.710000 466229 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 466301 2025-12-04T12:42:03.8016159Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8016205Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8016877Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8016921Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8017690Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8017760Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8018471Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:240: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8018511Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8019797Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.8019925Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.8021228Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.8021353Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.8022617Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
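The UserWarning above (repeated once per rank) names its own escape hatch: when the AccumulateGrad stream mismatch is known to be intentional, PyTorch can be told not to warn. A minimal sketch using only the call spelled out in the warning text:

    import torch

    # Only appropriate if the stream mismatch described in the warning above is
    # intentional; otherwise keep the warning and fix the stream usage instead.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)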
2025-12-04T12:42:03.8022763Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.8024034Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T12:42:03.8024159Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:42:03.8024292Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8024447Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8024732Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8024882Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8025161Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8025278Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8025550Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8025722Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8025993Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8026134Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8026402Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8026531Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:42:03.8026804Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8026956Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8027520Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8027632Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8027824Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8028315Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8028426Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8028632Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8028789Z E1204 12:35:44.977000 466298 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8028921Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8029075Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8029355Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8029503Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8029779Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8029895Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8030191Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8030335Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8030604Z E1204 12:35:44.978000 
466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8032990Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8033272Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8033404Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8033699Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8033857Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8034424Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1098907648 and is now 2820669440. 2025-12-04T12:42:03.8034534Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8034727Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8035185Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8035294Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8035576Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8035735Z E1204 12:35:44.978000 466301 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8035869Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8036021Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8036304Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8036451Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8036742Z E1204 12:35:44.994000 466299 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8036860Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8037132Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8037274Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8037621Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8037763Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8038031Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8038203Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8038493Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8038635Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8039187Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8039295Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8039488Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8039948Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8040057Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8040261Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8040419Z E1204 12:35:44.994000 466299 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8040550Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8040702Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8040982Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8041127Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8041418Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8041534Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8041803Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8041972Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8042240Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8042381Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8042662Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8042801Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8043072Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8043213Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8043762Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8043871Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8044060Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8044519Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8044627Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8044828Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8044986Z E1204 12:35:45.038000 466300 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8045028Z FAILED [8.4138s] [100%] 2025-12-04T12:42:03.8045030Z 2025-12-04T12:42:03.8045088Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8045273Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8045319Z Traceback (most recent call last): 2025-12-04T12:42:03.8045493Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8045538Z self._join_processes(fn) 2025-12-04T12:42:03.8045713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8045768Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8045948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8045991Z raise RuntimeError(error) 2025-12-04T12:42:03.8046072Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8046130Z Traceback (most recent call last): 2025-12-04T12:42:03.8046292Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8046334Z getattr(self, test_name)() 2025-12-04T12:42:03.8046494Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8046538Z 
fn() 2025-12-04T12:42:03.8046693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8046748Z method(*args, **kwargs) 2025-12-04T12:42:03.8046901Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8046940Z method(*args, **kwargs) 2025-12-04T12:42:03.8047093Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8047132Z with policy(): 2025-12-04T12:42:03.8047283Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8047326Z raise RuntimeError(msg) 2025-12-04T12:42:03.8047758Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8047762Z 2025-12-04T12:42:03.8047838Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8048213Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8048215Z 2025-12-04T12:42:03.8048306Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8048308Z 2025-12-04T12:42:03.8048310Z 2025-12-04T12:42:03.8048388Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8048476Z Process 0 terminated with exit code 10, terminating remaining processes. 
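Exit code 10 is the per-rank failure code surfaced by the multiprocess harness: each rank runs the test body in its own process, and the parent (_join_processes and _check_return_codes in the traceback above) joins the workers and re-raises if any rank exited non-zero. A stripped-down sketch of that join-and-check pattern; it is not the common_distributed.py implementation, and run_on_rank is an illustrative stand-in for the real test body:

    import multiprocessing as mp
    import sys

    def run_on_rank(rank: int) -> None:
        # Illustrative stand-in: a real rank would set up a process group, run
        # the test method, and exit with code 10 if its leak check failed.
        leaked = rank == 0
        sys.exit(10 if leaked else 0)

    def join_and_check(world_size: int = 4) -> None:
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=run_on_rank, args=(r,)) for r in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        join_and_check()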
2025-12-04T12:42:03.8048751Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-b0ce32d3573b3332.xml - 2025-12-04T12:42:03.8048811Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8049161Z FAILED [8.4138s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8049207Z Traceback (most recent call last): 2025-12-04T12:42:03.8049374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8049431Z getattr(self, test_name)() 2025-12-04T12:42:03.8049593Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8049630Z fn() 2025-12-04T12:42:03.8049783Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8049823Z method(*args, **kwargs) 2025-12-04T12:42:03.8049975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8050013Z method(*args, **kwargs) 2025-12-04T12:42:03.8050178Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8050215Z with policy(): 2025-12-04T12:42:03.8050368Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8050410Z raise RuntimeError(msg) 2025-12-04T12:42:03.8050840Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8050870Z 2025-12-04T12:42:03.8050945Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8051283Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8051286Z 2025-12-04T12:42:03.8051375Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8051439Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
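The repro block above can be replayed outside CI with the same two environment toggles. The same shell one-liner, expressed as a Python subprocess call run from the base repo dir; PYTORCH_PRINT_REPRO_ON_FAILURE=0 is the optional switch the message mentions for silencing the banner:

    import os
    import subprocess

    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
        # Optionally silence the repro banner on failure:
        # PYTORCH_PRINT_REPRO_ON_FAILURE="0",
    )
    # check=True raises CalledProcessError if the test (and its leak check) fails.
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py",
            "TestFSDPWithDeviceMeshAndDTensorCUDA."
            "test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda",
        ],
        env=env,
        check=True,
    )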
2025-12-04T12:42:03.8051503Z ======================= 1 failed, 14 deselected in 8.55s ======================= 2025-12-04T12:42:03.8051541Z Got exit code 1 2025-12-04T12:42:03.8051825Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8051956Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8052185Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-61ba864e2f05046f.xml 2025-12-04T12:42:03.8052243Z ============================= test session starts ============================== 2025-12-04T12:42:03.8052355Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8052398Z cachedir: .pytest_cache 2025-12-04T12:42:03.8052556Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8052604Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8052646Z configfile: pytest.ini 2025-12-04T12:42:03.8052809Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8053171Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8053223Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8053586Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8053646Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8053702Z collected 15 items / 4 deselected / 11 selected 2025-12-04T12:42:03.8053758Z stepcurrent: skipping 4 already run items. 2025-12-04T12:42:03.8053800Z Running 11 items in this shard 2025-12-04T12:42:03.8053804Z 2025-12-04T12:42:03.8054228Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:35:48.791000 466631 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 466700 2025-12-04T12:42:03.8054385Z I1204 12:35:48.792000 466631 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 466701 2025-12-04T12:42:03.8054538Z I1204 12:35:48.792000 466631 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 466702 2025-12-04T12:42:03.8054688Z I1204 12:35:48.793000 466631 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 466703 2025-12-04T12:42:03.8055384Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8055439Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8056112Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8056156Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8056825Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8056866Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8057540Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8057584Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8058086Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8058136Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8058680Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8058729Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8059231Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8059277Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8059766Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8059836Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8059973Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8060130Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8060414Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8060563Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8060843Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8060963Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8061235Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8061380Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8061648Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8061789Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8062061Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8062191Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8062465Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8062616Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8063174Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8063285Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8063491Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8063956Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8064086Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8064301Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8064459Z E1204 12:35:56.245000 466701 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8064591Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8064742Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8065023Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8065169Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8065447Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8065564Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8065832Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8066019Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8066292Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8066434Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8066702Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8066831Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:42:03.8067111Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8067253Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8067818Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8067925Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8068116Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8068617Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8068757Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8068963Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8069122Z E1204 12:35:56.275000 466700 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8069254Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8069406Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8069686Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8069834Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8070113Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8070231Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8070498Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8070640Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8070908Z E1204 
12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8071049Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8071314Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8071457Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8071731Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8071873Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8072439Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1256194048 and is now 2843738112. 2025-12-04T12:42:03.8072547Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8072747Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8073213Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8073321Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8073523Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8073680Z E1204 12:35:56.281000 466703 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8073814Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8073966Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8074247Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8074394Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8074671Z E1204 12:35:56.302000 466702 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8074786Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8075056Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8075198Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8075465Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8075606Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8075882Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8076011Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8076282Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8076435Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8076988Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8077104Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8077306Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8077763Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8077871Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8078074Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8078278Z E1204 12:35:56.302000 466702 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8078319Z FAILED [8.8150s] [ 9%] 2025-12-04T12:42:03.8078322Z 2025-12-04T12:42:03.8078378Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8078565Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8078613Z Traceback (most recent call last): 2025-12-04T12:42:03.8078778Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8078821Z self._join_processes(fn) 2025-12-04T12:42:03.8078995Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8079049Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8079227Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8079272Z raise RuntimeError(error) 2025-12-04T12:42:03.8079353Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8079398Z Traceback (most recent call last): 2025-12-04T12:42:03.8079562Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8079604Z getattr(self, test_name)() 2025-12-04T12:42:03.8079763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8079797Z fn() 2025-12-04T12:42:03.8079963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8080004Z method(*args, **kwargs) 2025-12-04T12:42:03.8080156Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8080197Z method(*args, **kwargs) 2025-12-04T12:42:03.8080347Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8080385Z with policy(): 2025-12-04T12:42:03.8080552Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8080596Z raise RuntimeError(msg) 2025-12-04T12:42:03.8081030Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8081047Z 2025-12-04T12:42:03.8081124Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8081476Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8081479Z 2025-12-04T12:42:03.8081567Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8081570Z 2025-12-04T12:42:03.8081572Z 2025-12-04T12:42:03.8081648Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8081736Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8082010Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-61ba864e2f05046f.xml - 2025-12-04T12:42:03.8082071Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8082422Z FAILED [8.8150s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8082468Z Traceback (most recent call last): 2025-12-04T12:42:03.8082636Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8082678Z getattr(self, test_name)() 2025-12-04T12:42:03.8082840Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8082873Z fn() 2025-12-04T12:42:03.8083026Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8083067Z method(*args, **kwargs) 2025-12-04T12:42:03.8083220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8083259Z method(*args, **kwargs) 2025-12-04T12:42:03.8083410Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8083449Z with policy(): 2025-12-04T12:42:03.8083602Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8083643Z raise RuntimeError(msg) 2025-12-04T12:42:03.8084086Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8084089Z 2025-12-04T12:42:03.8084165Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8084504Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8084507Z 2025-12-04T12:42:03.8084607Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8084671Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8084734Z ======================= 1 failed, 4 deselected in 8.98s ======================== 2025-12-04T12:42:03.8084771Z Got exit code 1 2025-12-04T12:42:03.8084811Z Retrying single test... 2025-12-04T12:42:03.8085054Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-8fc19adac4e61b04.xml 2025-12-04T12:42:03.8085126Z ============================= test session starts ============================== 2025-12-04T12:42:03.8085240Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8085281Z cachedir: .pytest_cache 2025-12-04T12:42:03.8085439Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8085484Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8085525Z configfile: pytest.ini 2025-12-04T12:42:03.8085687Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8086045Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8086097Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8086440Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8086498Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8086557Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8086889Z stepcurrent: skipping 4 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8086935Z Running 1 items in this shard 2025-12-04T12:42:03.8086939Z 2025-12-04T12:42:03.8087347Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:36:00.205000 467033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 467102 2025-12-04T12:42:03.8087503Z I1204 12:36:00.205000 467033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 467103 2025-12-04T12:42:03.8087656Z I1204 12:36:00.206000 467033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 467104 2025-12-04T12:42:03.8087805Z I1204 12:36:00.207000 467033 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 467105 2025-12-04T12:42:03.8088530Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8088576Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8089260Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8089305Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8089974Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8090044Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8090712Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8090753Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8091252Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8091302Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8091794Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8091843Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8092330Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8092378Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8092882Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8092931Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8093067Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8093226Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8093509Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8095085Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8095368Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8095497Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8095769Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8095920Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8096192Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8096330Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8096601Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8096732Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8097002Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8097144Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8097701Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 958398464 and is now 2843738112. 
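Separately from the leak itself, each rank also emits the FutureWarning above about FSDP.state_dict_type()/FSDP.set_state_dict_type() being deprecated in favor of get_state_dict()/set_state_dict() from torch.distributed.checkpoint.state_dict. A hedged sketch of that migration follows; the model and optimizer are placeholders, and the exact keyword names should be confirmed against the API doc linked in the warning.

# Hedged sketch of the migration suggested by the FutureWarning above. The model and
# optimizer are placeholders; consult the linked API doc for the authoritative
# signatures of get_state_dict()/set_state_dict().
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict


def checkpoint_roundtrip(model: FSDP, optim: torch.optim.Optimizer):
    # Instead of wrapping calls in FSDP.set_state_dict_type(...), ask the checkpoint
    # helpers for the (sharded) model and optimizer state dicts directly.
    model_sd, optim_sd = get_state_dict(model, optim)

    # ... save model_sd / optim_sd with torch.distributed.checkpoint ...

    # Restore them later through the matching setter.
    set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)
    return model_sd, optim_sd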
2025-12-04T12:42:03.8097812Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8098003Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8098501Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8098624Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8098828Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8098989Z E1204 12:36:07.743000 467105 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8099121Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8099274Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8099567Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8099715Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8099993Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8100138Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8100407Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8100548Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8100818Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8100958Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8101229Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8101360Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8101630Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8101772Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8102323Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8102434Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8102624Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8103090Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8103199Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8103400Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8103559Z E1204 12:36:07.791000 467102 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8103706Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8103861Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8104140Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8104298Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8104584Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8104698Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8104966Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8105107Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8105375Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8105515Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8105783Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8105911Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8106184Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8106325Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8106876Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8106985Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8107173Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8107640Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8107749Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8107949Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8108122Z E1204 12:36:07.805000 467103 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8108284Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8108438Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8108731Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8108890Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8109167Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8109282Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8109551Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8109691Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8109961Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8110098Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8110367Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8110495Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8110766Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8110910Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8111460Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8111581Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8111770Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8112228Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8112336Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8112549Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8112709Z E1204 12:36:07.812000 467104 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8112748Z FAILED [8.9157s] [100%] 2025-12-04T12:42:03.8112760Z 2025-12-04T12:42:03.8112816Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8113000Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8113057Z Traceback (most recent call last): 2025-12-04T12:42:03.8113220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8113265Z self._join_processes(fn) 2025-12-04T12:42:03.8113438Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8113491Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8113670Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8113715Z raise RuntimeError(error) 2025-12-04T12:42:03.8113794Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8113840Z Traceback (most recent call last): 2025-12-04T12:42:03.8114002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8114045Z getattr(self, test_name)() 2025-12-04T12:42:03.8114203Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8114238Z fn() 2025-12-04T12:42:03.8114390Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8114430Z method(*args, **kwargs) 2025-12-04T12:42:03.8114582Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8114622Z method(*args, **kwargs) 2025-12-04T12:42:03.8114774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8114812Z with policy(): 2025-12-04T12:42:03.8114966Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8115007Z raise RuntimeError(msg) 2025-12-04T12:42:03.8115447Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8115450Z 2025-12-04T12:42:03.8115524Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8115877Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8115881Z 2025-12-04T12:42:03.8115969Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8115972Z 2025-12-04T12:42:03.8116032Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8116078Z Traceback (most recent call last): 2025-12-04T12:42:03.8116249Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8116293Z getattr(self, test_name)() 2025-12-04T12:42:03.8116473Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8116520Z fn() 2025-12-04T12:42:03.8116673Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8116726Z method(*args, **kwargs) 2025-12-04T12:42:03.8116875Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8116925Z method(*args, **kwargs) 2025-12-04T12:42:03.8117074Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8117112Z with policy(): 2025-12-04T12:42:03.8117263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8117305Z raise RuntimeError(msg) 2025-12-04T12:42:03.8117736Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 958398464 and is now 2843738112. 
2025-12-04T12:42:03.8117739Z 2025-12-04T12:42:03.8117814Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8118192Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8118195Z 2025-12-04T12:42:03.8118281Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8118285Z 2025-12-04T12:42:03.8118287Z 2025-12-04T12:42:03.8118363Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8118450Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8118725Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-8fc19adac4e61b04.xml - 2025-12-04T12:42:03.8118786Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8119138Z FAILED [8.9157s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8119183Z Traceback (most recent call last): 2025-12-04T12:42:03.8119349Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8119391Z getattr(self, test_name)() 2025-12-04T12:42:03.8119568Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8119604Z fn() 2025-12-04T12:42:03.8119754Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8119797Z method(*args, **kwargs) 2025-12-04T12:42:03.8119948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8119987Z method(*args, **kwargs) 2025-12-04T12:42:03.8120136Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8120174Z with policy(): 2025-12-04T12:42:03.8120338Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8120380Z raise RuntimeError(msg) 2025-12-04T12:42:03.8120814Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T12:42:03.8120828Z 2025-12-04T12:42:03.8120915Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8121254Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8121258Z 2025-12-04T12:42:03.8121346Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8121348Z 2025-12-04T12:42:03.8121408Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8121453Z Traceback (most recent call last): 2025-12-04T12:42:03.8121618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8121659Z getattr(self, test_name)() 2025-12-04T12:42:03.8121821Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8121856Z fn() 2025-12-04T12:42:03.8122010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8122049Z method(*args, **kwargs) 2025-12-04T12:42:03.8122199Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8122239Z method(*args, **kwargs) 2025-12-04T12:42:03.8122389Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8122426Z with policy(): 2025-12-04T12:42:03.8122578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8122620Z raise RuntimeError(msg) 2025-12-04T12:42:03.8123053Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 958398464 and is now 2843738112. 2025-12-04T12:42:03.8123056Z 2025-12-04T12:42:03.8123129Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8123465Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8123468Z 2025-12-04T12:42:03.8123565Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8123630Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8123697Z ======================= 1 failed, 14 deselected in 9.06s ======================= 2025-12-04T12:42:03.8123735Z Got exit code 1 2025-12-04T12:42:03.8123777Z Retrying single test... 
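The parent-side traceback above shows how a multi-process distributed test turns per-rank failures into a single pytest failure: the parent joins the spawned workers in _join_processes(), and _check_return_codes() re-raises when any rank exits non-zero (exit code 10 here), after which the runner script sees pytest exit with code 1 and retries the single test. The following is a minimal sketch of that join-and-check pattern using plain multiprocessing; it is not the MultiProcessTestCase code, and the worker body is a stand-in.

# Minimal sketch of the join-and-check pattern visible in the traceback above
# (self._join_processes -> self._check_return_codes -> raise RuntimeError). This is a
# plain multiprocessing illustration, not the MultiProcessTestCase implementation.
import multiprocessing as mp


def _worker(rank: int) -> None:
    # A real distributed test would init the process group and run the test body here;
    # exiting non-zero signals failure to the parent (the log shows exit code 10).
    raise SystemExit(10 if rank == 1 else 0)


def run_test(world_size: int = 4) -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=_worker, args=(rank,)) for rank in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            # Mirrors "RuntimeError: Process N exited with error code 10 and exception:"
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")


if __name__ == "__main__":
    run_test()

Running this sketch reproduces the parent-side "Process 1 exited with error code 10" style of error without any GPU involvement, which is why the pytest summary reports the failure against the parent test rather than an individual rank.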
2025-12-04T12:42:03.8124002Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-549814058b19f53a.xml 2025-12-04T12:42:03.8124062Z ============================= test session starts ============================== 2025-12-04T12:42:03.8124183Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8124226Z cachedir: .pytest_cache 2025-12-04T12:42:03.8124383Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8124430Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8124470Z configfile: pytest.ini 2025-12-04T12:42:03.8124643Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8125013Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8125064Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8125410Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8125467Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8125525Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8125855Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8125902Z Running 1 items in this shard 2025-12-04T12:42:03.8125904Z 2025-12-04T12:42:03.8126317Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:36:11.683000 467435 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 467504 2025-12-04T12:42:03.8126473Z I1204 12:36:11.684000 467435 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 467505 2025-12-04T12:42:03.8126626Z I1204 12:36:11.684000 467435 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 467506 2025-12-04T12:42:03.8126776Z I1204 12:36:11.685000 467435 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 467507 2025-12-04T12:42:03.8127458Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8127502Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8128227Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8128272Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8128956Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8128999Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8129667Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8129735Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8130232Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8130280Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8130770Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8130818Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8131312Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8131358Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8131844Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8131891Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8132025Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8132181Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8132463Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8132620Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8132899Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8133016Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8133297Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8133440Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8133711Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8133860Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8134146Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8134275Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8134547Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8134689Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8135246Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T12:42:03.8135358Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8135547Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8136008Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8136119Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8136322Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8136482Z E1204 12:36:19.102000 467504 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8136614Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8136767Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8137055Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8137203Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8137480Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8137606Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8137875Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8138016Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8138335Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8138487Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8138758Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8138886Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8139158Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8139299Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8139853Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8139961Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8140151Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8140608Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8140716Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8140919Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8141077Z E1204 12:36:19.107000 467507 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8141218Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8141372Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8141651Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8141798Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8142085Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8142200Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8142468Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8142623Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8142901Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8143040Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8143308Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8143436Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8143708Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8143849Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8144401Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1249902592 and is now 2843738112. 2025-12-04T12:42:03.8144509Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8144699Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8145163Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8145270Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8145473Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8145638Z E1204 12:36:19.115000 467506 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8145769Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8145922Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8146200Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8146356Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8146633Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8146750Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8147027Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8147180Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8147453Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8147591Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8147859Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8147988Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8148295Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8148436Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8148987Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8149097Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8149287Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8149747Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8149854Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8150070Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8150228Z E1204 12:36:19.163000 467505 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8150270Z FAILED [8.7142s] [100%] 2025-12-04T12:42:03.8150272Z 2025-12-04T12:42:03.8150329Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8150513Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8150572Z Traceback (most recent call last): 2025-12-04T12:42:03.8150735Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8150779Z self._join_processes(fn) 2025-12-04T12:42:03.8150951Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8151023Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8151200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8151258Z raise RuntimeError(error) 2025-12-04T12:42:03.8151337Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8151384Z Traceback (most recent call last): 2025-12-04T12:42:03.8151544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8151588Z getattr(self, test_name)() 2025-12-04T12:42:03.8151745Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8151780Z fn() 2025-12-04T12:42:03.8151932Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8151974Z method(*args, **kwargs) 2025-12-04T12:42:03.8152125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8152167Z method(*args, **kwargs) 2025-12-04T12:42:03.8152317Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8152354Z with policy(): 2025-12-04T12:42:03.8152506Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8152548Z raise RuntimeError(msg) 2025-12-04T12:42:03.8152982Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8152985Z 2025-12-04T12:42:03.8153059Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8153400Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8153402Z 2025-12-04T12:42:03.8153489Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8153492Z 2025-12-04T12:42:03.8153493Z 2025-12-04T12:42:03.8153572Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8153662Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8153942Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-549814058b19f53a.xml - 2025-12-04T12:42:03.8154005Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8154354Z FAILED [8.7142s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8154402Z Traceback (most recent call last): 2025-12-04T12:42:03.8154579Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8154624Z getattr(self, test_name)() 2025-12-04T12:42:03.8154784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8154821Z fn() 2025-12-04T12:42:03.8154973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8155025Z method(*args, **kwargs) 2025-12-04T12:42:03.8155188Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8155230Z method(*args, **kwargs) 2025-12-04T12:42:03.8155379Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8155419Z with policy(): 2025-12-04T12:42:03.8155572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8155615Z raise RuntimeError(msg) 2025-12-04T12:42:03.8156050Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8156055Z 2025-12-04T12:42:03.8156132Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8156475Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8156477Z 2025-12-04T12:42:03.8156565Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8156629Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8156690Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8156728Z Got exit code 1 2025-12-04T12:42:03.8157013Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8157146Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8157373Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-41e14a589dc61213.xml 2025-12-04T12:42:03.8157435Z ============================= test session starts ============================== 2025-12-04T12:42:03.8157551Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8157593Z cachedir: .pytest_cache 2025-12-04T12:42:03.8157765Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8157810Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8157855Z configfile: pytest.ini 2025-12-04T12:42:03.8158018Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8158419Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8158471Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8158832Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8158892Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8158950Z collected 15 items / 5 deselected / 10 selected 2025-12-04T12:42:03.8159016Z stepcurrent: skipping 5 already run items. 
2025-12-04T12:42:03.8159063Z Running 10 items in this shard 2025-12-04T12:42:03.8159065Z 2025-12-04T12:42:03.8159484Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:36:23.035000 467837 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 467906 2025-12-04T12:42:03.8159642Z I1204 12:36:23.036000 467837 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 467907 2025-12-04T12:42:03.8159798Z I1204 12:36:23.037000 467837 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 467908 2025-12-04T12:42:03.8159950Z I1204 12:36:23.037000 467837 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 467909 2025-12-04T12:42:03.8160637Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8160681Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8161355Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8161403Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8162066Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8162110Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8162792Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8162836Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8163343Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8163391Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8163883Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8163948Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8164449Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8164498Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8164988Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8165037Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8165172Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8165331Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8165613Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8165761Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8166040Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8166157Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8166430Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8166571Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8166841Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8167001Z E1204 12:36:30.522000 467909 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8167291Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8167421Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8167717Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8167860Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8168465Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2843738112. 2025-12-04T12:42:03.8168601Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8168791Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8169249Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8169359Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8169562Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8169721Z E1204 12:36:30.522000 467909 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8169851Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8170006Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8170288Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8170436Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8170712Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8170830Z E1204 12:36:30.542000 467908 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8171100Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8171239Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8171521Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8171662Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8171930Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8172069Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8172343Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8172485Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8173047Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8173166Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8173356Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8173812Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8173920Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8174122Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8174282Z E1204 12:36:30.542000 467908 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8174413Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8174565Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8174846Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8174996Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8175273Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8175389Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8175670Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8175810Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8176079Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8176218Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8176498Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8176625Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8176897Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8177047Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8177606Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8177715Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8177904Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8178395Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8178503Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8178707Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8178865Z E1204 12:36:30.578000 467906 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8178995Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8179148Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8179428Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8179576Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8179853Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8179969Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8180252Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8180395Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8180669Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8180821Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8181092Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8181219Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8181501Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8181653Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8182203Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8182314Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8182502Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8182960Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8183068Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8183270Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8183427Z E1204 12:36:30.622000 467907 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8183467Z FAILED [8.6157s] [ 10%] 2025-12-04T12:42:03.8183470Z 2025-12-04T12:42:03.8183526Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8183709Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8183757Z Traceback (most recent call last): 2025-12-04T12:42:03.8183920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8183965Z self._join_processes(fn) 2025-12-04T12:42:03.8184137Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8184206Z self._check_return_codes(fn, elapsed_time) 
2025-12-04T12:42:03.8184383Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes
2025-12-04T12:42:03.8184429Z     raise RuntimeError(error)
2025-12-04T12:42:03.8184509Z RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T12:42:03.8184555Z Traceback (most recent call last):
2025-12-04T12:42:03.8184715Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T12:42:03.8184759Z     getattr(self, test_name)()
2025-12-04T12:42:03.8184926Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T12:42:03.8184962Z     fn()
2025-12-04T12:42:03.8185113Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8185154Z     method(*args, **kwargs)
2025-12-04T12:42:03.8185307Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8185358Z     method(*args, **kwargs)
2025-12-04T12:42:03.8185507Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:42:03.8185556Z     with policy():
2025-12-04T12:42:03.8185707Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:42:03.8185749Z     raise RuntimeError(msg)
2025-12-04T12:42:03.8186182Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2843738112.
2025-12-04T12:42:03.8186186Z 
2025-12-04T12:42:03.8186260Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.8186599Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T12:42:03.8186603Z 
2025-12-04T12:42:03.8186690Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.8186692Z 
2025-12-04T12:42:03.8186694Z 
2025-12-04T12:42:03.8186772Z ----------------------------- Captured stdout call -----------------------------
2025-12-04T12:42:03.8186859Z Process 3 terminated with exit code 10, terminating remaining processes.
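Note on the harness behavior shown above: the multi-process test scaffolding in torch/testing/_internal/common_distributed.py runs one worker per rank, and once any worker exits non-zero (exit code 10 here accompanies the memory-leak failure), the parent stops the remaining workers and re-raises the worker's exception via _join_processes/_check_return_codes. A minimal sketch of that join-and-check pattern, using plain multiprocessing and hypothetical names (run_rank, MEM_LEAK_EXIT_CODE) rather than the actual PyTorch internals:

    # Sketch only: illustrates the join-and-check pattern described above.
    import multiprocessing as mp

    MEM_LEAK_EXIT_CODE = 10  # assumed meaning, mirroring "exit code 10" in the log

    def run_rank(rank: int) -> None:
        # A real worker would set up its device/process group and run the test body;
        # here rank 3 simply fails the way the log shows.
        raise SystemExit(MEM_LEAK_EXIT_CODE if rank == 3 else 0)

    def join_and_check(world_size: int = 4) -> None:
        procs = [mp.Process(target=run_rank, args=(rank,)) for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                # The first non-zero exit code fails the whole test.
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

    if __name__ == "__main__":
        join_and_check()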
2025-12-04T12:42:03.8187132Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-41e14a589dc61213.xml -
2025-12-04T12:42:03.8187193Z =========================== short test summary info ============================
2025-12-04T12:42:03.8187543Z FAILED [8.6157s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception:
2025-12-04T12:42:03.8187590Z Traceback (most recent call last):
2025-12-04T12:42:03.8187756Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T12:42:03.8187798Z     getattr(self, test_name)()
2025-12-04T12:42:03.8187958Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T12:42:03.8187994Z     fn()
2025-12-04T12:42:03.8188197Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8188239Z     method(*args, **kwargs)
2025-12-04T12:42:03.8188389Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8188430Z     method(*args, **kwargs)
2025-12-04T12:42:03.8188581Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:42:03.8188621Z     with policy():
2025-12-04T12:42:03.8188774Z   File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:42:03.8188814Z     raise RuntimeError(msg)
2025-12-04T12:42:03.8189261Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2843738112.
2025-12-04T12:42:03.8189263Z 
2025-12-04T12:42:03.8189351Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.8189689Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda
2025-12-04T12:42:03.8189706Z 
2025-12-04T12:42:03.8189793Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.8189858Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T12:42:03.8189921Z ======================= 1 failed, 5 deselected in 8.75s ========================
2025-12-04T12:42:03.8189959Z Got exit code 1
2025-12-04T12:42:03.8189999Z Retrying single test...
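Context for the retry below: the RuntimeError comes from PyTorch's GPU memory-leak checker (enabled here by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 in the repro command and the mem_leak_check config of this shard), which compares caching-allocator and driver-level memory counters taken before and after the test body; on ROCm the counters are reported through the CUDA-compatible API. A rough, self-contained sketch of that kind of before/after check, assuming a simplified check_for_leak helper rather than the actual implementation in common_utils.py:

    # Rough sketch of a before/after GPU memory-leak check, as suggested by the
    # "Caching allocator allocated memory was X and is now Y" messages above.
    import torch

    def check_for_leak(fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)
        driver_before = total - free_before                   # driver-level allocation

        fn()  # run the test body

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        # Flag a leak only when both the caching allocator and the driver report growth,
        # which is the spirit of "CUDA driver API confirmed a leak" in the log.
        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"Possible leak on device {device}: caching allocator allocated memory was "
                f"{alloc_before} and is now {alloc_after}; driver allocated memory was "
                f"{driver_before} and is now {driver_after}."
            )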
2025-12-04T12:42:03.8190229Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-51d18ec349ed0c91.xml 2025-12-04T12:42:03.8190287Z ============================= test session starts ============================== 2025-12-04T12:42:03.8190400Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8190442Z cachedir: .pytest_cache 2025-12-04T12:42:03.8190600Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8190646Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8190687Z configfile: pytest.ini 2025-12-04T12:42:03.8190851Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8191209Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8191263Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8191608Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8191667Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8191722Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8192053Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8192097Z Running 1 items in this shard 2025-12-04T12:42:03.8192099Z 2025-12-04T12:42:03.8192519Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:36:34.342000 468239 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 468308 2025-12-04T12:42:03.8192678Z I1204 12:36:34.343000 468239 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 468309 2025-12-04T12:42:03.8192829Z I1204 12:36:34.343000 468239 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 468310 2025-12-04T12:42:03.8192988Z I1204 12:36:34.344000 468239 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 468311 2025-12-04T12:42:03.8193669Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8193732Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8194403Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8194445Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8195115Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8195159Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8195654Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8195703Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8196370Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8196414Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8196905Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8196953Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8197454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8197501Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8197998Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8198044Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8198216Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8198386Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8198687Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8198834Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8199115Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8199234Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8199505Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8199649Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8199918Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8200061Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8200329Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8200458Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8200730Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8200871Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8201437Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1256194048 and is now 2843738112. 
2025-12-04T12:42:03.8201547Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8201739Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8202198Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8202323Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8202530Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8202687Z E1204 12:36:41.805000 468311 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8202829Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8202992Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8203273Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8203418Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8203697Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8203813Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8204083Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8204223Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8204492Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8204633Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8204900Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8205030Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8205299Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8205441Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8206004Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8206115Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8206305Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8206770Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8206880Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8207083Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8207258Z E1204 12:36:41.814000 468308 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8207389Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8207541Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8207822Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8207968Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8208283Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8208399Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8208670Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8208810Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8209078Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8209218Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8209486Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8209615Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8209886Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8210039Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8210590Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8210699Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8210900Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8211356Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8211476Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8211769Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8211926Z E1204 12:36:41.816000 468310 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8212057Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8212211Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8212492Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8212638Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8212918Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8213032Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8213303Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8213444Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8213712Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8213854Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8214120Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8214248Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8214527Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8216888Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8217452Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8217618Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8217811Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8218317Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8218472Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8218675Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8218834Z E1204 12:36:41.879000 468309 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8218876Z FAILED [8.7139s] [100%] 2025-12-04T12:42:03.8218881Z 2025-12-04T12:42:03.8218941Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8219124Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8219174Z Traceback (most recent call last): 2025-12-04T12:42:03.8219341Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8219386Z self._join_processes(fn) 2025-12-04T12:42:03.8219562Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8219616Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8219795Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8219838Z raise RuntimeError(error) 2025-12-04T12:42:03.8219920Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8219965Z Traceback (most recent call last): 2025-12-04T12:42:03.8220128Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8220173Z getattr(self, test_name)() 2025-12-04T12:42:03.8220333Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8220367Z fn() 2025-12-04T12:42:03.8220518Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8220559Z method(*args, **kwargs) 2025-12-04T12:42:03.8220711Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8220750Z method(*args, **kwargs) 2025-12-04T12:42:03.8220916Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8220954Z with policy(): 2025-12-04T12:42:03.8221109Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8221150Z raise RuntimeError(msg) 2025-12-04T12:42:03.8221592Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8221594Z 2025-12-04T12:42:03.8221686Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8222027Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8222041Z 2025-12-04T12:42:03.8222132Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8222134Z 2025-12-04T12:42:03.8222207Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8222254Z Traceback (most recent call last): 2025-12-04T12:42:03.8222416Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8222460Z getattr(self, test_name)() 2025-12-04T12:42:03.8222619Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8222654Z fn() 2025-12-04T12:42:03.8222805Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8222844Z method(*args, **kwargs) 2025-12-04T12:42:03.8222997Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8223036Z method(*args, **kwargs) 2025-12-04T12:42:03.8223187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8223224Z with policy(): 2025-12-04T12:42:03.8223376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8223416Z raise RuntimeError(msg) 2025-12-04T12:42:03.8223848Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1256194048 and is now 2843738112. 
2025-12-04T12:42:03.8223851Z 2025-12-04T12:42:03.8223925Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8224269Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8224273Z 2025-12-04T12:42:03.8224360Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8224363Z 2025-12-04T12:42:03.8224365Z 2025-12-04T12:42:03.8224441Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8224530Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8224804Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-51d18ec349ed0c91.xml - 2025-12-04T12:42:03.8224876Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8225223Z FAILED [8.7139s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8225270Z Traceback (most recent call last): 2025-12-04T12:42:03.8225434Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8225477Z getattr(self, test_name)() 2025-12-04T12:42:03.8225647Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8225683Z fn() 2025-12-04T12:42:03.8225835Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8225877Z method(*args, **kwargs) 2025-12-04T12:42:03.8226027Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8226077Z method(*args, **kwargs) 2025-12-04T12:42:03.8226239Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8226276Z with policy(): 2025-12-04T12:42:03.8226428Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8226468Z raise RuntimeError(msg) 2025-12-04T12:42:03.8226905Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 
2025-12-04T12:42:03.8226907Z 2025-12-04T12:42:03.8226980Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8227317Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8227320Z 2025-12-04T12:42:03.8227408Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8227410Z 2025-12-04T12:42:03.8227470Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8227515Z Traceback (most recent call last): 2025-12-04T12:42:03.8227677Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8227719Z getattr(self, test_name)() 2025-12-04T12:42:03.8227877Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8227912Z fn() 2025-12-04T12:42:03.8228062Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8228104Z method(*args, **kwargs) 2025-12-04T12:42:03.8228292Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8228331Z method(*args, **kwargs) 2025-12-04T12:42:03.8228480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8228518Z with policy(): 2025-12-04T12:42:03.8228670Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8228710Z raise RuntimeError(msg) 2025-12-04T12:42:03.8229156Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1256194048 and is now 2843738112. 2025-12-04T12:42:03.8229160Z 2025-12-04T12:42:03.8229233Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8229580Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8229584Z 2025-12-04T12:42:03.8229669Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8229733Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8229797Z ======================= 1 failed, 14 deselected in 8.86s ======================= 2025-12-04T12:42:03.8229834Z Got exit code 1 2025-12-04T12:42:03.8229887Z Retrying single test... 
2025-12-04T12:42:03.8230113Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-1995dd2608cbad94.xml 2025-12-04T12:42:03.8230185Z ============================= test session starts ============================== 2025-12-04T12:42:03.8230299Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8230340Z cachedir: .pytest_cache 2025-12-04T12:42:03.8230500Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8230546Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8230586Z configfile: pytest.ini 2025-12-04T12:42:03.8230751Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8231111Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8231162Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8231508Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8231567Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8231623Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8231953Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8231997Z Running 1 items in this shard 2025-12-04T12:42:03.8231999Z 2025-12-04T12:42:03.8232405Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:36:45.688000 468641 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 468710 2025-12-04T12:42:03.8232561Z I1204 12:36:45.689000 468641 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 468711 2025-12-04T12:42:03.8232715Z I1204 12:36:45.689000 468641 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 468712 2025-12-04T12:42:03.8232866Z I1204 12:36:45.690000 468641 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 468713 2025-12-04T12:42:03.8233559Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8233607Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8234292Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8234347Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8234844Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8234906Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8235585Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8235627Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8236294Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8236337Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8236830Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8236878Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8237364Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8237412Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8237912Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8237960Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8238095Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8238294Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8238589Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8238736Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8239014Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8239142Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8239426Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8239567Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8239836Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8239977Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8240246Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8240377Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8240646Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8240788Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8241342Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1107296256 and is now 2843738112. 
2025-12-04T12:42:03.8241454Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8241643Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8242100Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8242224Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8242428Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8242588Z E1204 12:36:53.132000 468713 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8242719Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8242880Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8243160Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8243306Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8243592Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8243718Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8243987Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8244127Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8244397Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8244538Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8244806Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8244935Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8245208Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8245350Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8245902Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8246011Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8246201Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8246665Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8246777Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8247062Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8247220Z E1204 12:36:53.155000 468711 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8247366Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8247519Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8247801Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8247958Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8248283Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8248399Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8248665Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8248807Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8249075Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8249216Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8249483Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8249613Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8249882Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8250023Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8250576Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2996830208. 2025-12-04T12:42:03.8250686Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8250895Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8251350Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8251458Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8251671Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8251828Z E1204 12:36:53.186000 468710 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8251958Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8252111Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8252401Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8252560Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8252841Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8252954Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8253223Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8253362Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8253630Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8253768Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8254034Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8254162Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8254430Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8254572Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8255123Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2843738112. 
2025-12-04T12:42:03.8255241Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8255430Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8255886Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8256006Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8256207Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8256365Z E1204 12:36:53.195000 468712 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8256416Z FAILED [8.7146s] [100%] 2025-12-04T12:42:03.8256418Z 2025-12-04T12:42:03.8256476Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8256669Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8256715Z Traceback (most recent call last): 2025-12-04T12:42:03.8256877Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8256923Z self._join_processes(fn) 2025-12-04T12:42:03.8257096Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8257151Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8257332Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8257376Z raise RuntimeError(error) 2025-12-04T12:42:03.8257457Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8257503Z Traceback (most recent call last): 2025-12-04T12:42:03.8257664Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8257706Z getattr(self, test_name)() 2025-12-04T12:42:03.8257865Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8257899Z fn() 2025-12-04T12:42:03.8258050Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8258090Z method(*args, **kwargs) 2025-12-04T12:42:03.8258285Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8258325Z method(*args, **kwargs) 2025-12-04T12:42:03.8258475Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8258513Z with policy(): 2025-12-04T12:42:03.8258666Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8258706Z raise RuntimeError(msg) 2025-12-04T12:42:03.8259138Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8259141Z 2025-12-04T12:42:03.8259231Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8259569Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8259573Z 2025-12-04T12:42:03.8259661Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8259664Z 2025-12-04T12:42:03.8259665Z 2025-12-04T12:42:03.8259741Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8259842Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8260117Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-1995dd2608cbad94.xml - 2025-12-04T12:42:03.8260179Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8260538Z FAILED [8.7146s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8260597Z Traceback (most recent call last): 2025-12-04T12:42:03.8260762Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8260803Z getattr(self, test_name)() 2025-12-04T12:42:03.8260964Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8260998Z fn() 2025-12-04T12:42:03.8261149Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8261190Z method(*args, **kwargs) 2025-12-04T12:42:03.8261340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8261379Z method(*args, **kwargs) 2025-12-04T12:42:03.8261530Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8261566Z with policy(): 2025-12-04T12:42:03.8261718Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8261757Z raise RuntimeError(msg) 2025-12-04T12:42:03.8262192Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2843738112. 2025-12-04T12:42:03.8262195Z 2025-12-04T12:42:03.8262268Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8262606Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8262609Z 2025-12-04T12:42:03.8262697Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8262759Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8262821Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8262858Z Got exit code 1 2025-12-04T12:42:03.8263153Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8263282Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8263511Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-30cedd8bdfe9b548.xml 2025-12-04T12:42:03.8263570Z ============================= test session starts ============================== 2025-12-04T12:42:03.8263682Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8263722Z cachedir: .pytest_cache 2025-12-04T12:42:03.8263893Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8263939Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8263980Z configfile: pytest.ini 2025-12-04T12:42:03.8264143Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8264512Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8264580Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8264925Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8264982Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8265035Z collected 15 items / 6 deselected / 9 selected 2025-12-04T12:42:03.8265087Z stepcurrent: skipping 6 already run items. 
2025-12-04T12:42:03.8265130Z Running 9 items in this shard 2025-12-04T12:42:03.8265133Z 2025-12-04T12:42:03.8265541Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:36:56.876000 469043 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 469112 2025-12-04T12:42:03.8265697Z I1204 12:36:56.876000 469043 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 469113 2025-12-04T12:42:03.8265849Z I1204 12:36:56.877000 469043 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 469114 2025-12-04T12:42:03.8265999Z I1204 12:36:56.878000 469043 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 469115 2025-12-04T12:42:03.8266685Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8266730Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8267398Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8267441Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8268141Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8268252Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8268938Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8268979Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8269493Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8269555Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8270047Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8270097Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8270582Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8270629Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8271114Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8271160Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8271295Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8271450Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8271735Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8271881Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8272159Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8272289Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8272559Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8272701Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8272981Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8273121Z E1204 12:37:04.309000 469112 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8273389Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8273528Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8273810Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8273951Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8274507Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8274617Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8274807Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8275264Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8275371Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8275573Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8275732Z E1204 12:37:04.309000 469112 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8275866Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8276019Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8276300Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8276447Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8276735Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8276850Z E1204 12:37:04.316000 469115 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8277119Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8277270Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8277539Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8277680Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8277960Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8278099Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8278414Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8278556Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8279110Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440. 
2025-12-04T12:42:03.8279218Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8279407Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8279863Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8279971Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8280174Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8280334Z E1204 12:37:04.316000 469115 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8280463Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8280618Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8280911Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8281058Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8281336Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8281451Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8281732Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8281872Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8282141Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8282301Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8282583Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8282712Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8282981Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8283121Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8283674Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8283782Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8283971Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8284428Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8284536Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8284737Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8284894Z E1204 12:37:04.322000 469114 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8285024Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8285177Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8285464Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8285611Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8285888Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8286012Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8286281Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8286420Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8286698Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8286847Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8287115Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8287243Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8287512Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8287654Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8288386Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8288494Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8288684Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8289139Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8289248Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8289453Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8289609Z E1204 12:37:04.389000 469113 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8289648Z FAILED [8.7146s] [ 11%] 2025-12-04T12:42:03.8289667Z 2025-12-04T12:42:03.8289724Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8289907Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8289955Z Traceback (most recent call last): 2025-12-04T12:42:03.8290119Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8290163Z self._join_processes(fn) 2025-12-04T12:42:03.8290351Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8290404Z self._check_return_codes(fn, elapsed_time) 
2025-12-04T12:42:03.8290583Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8290626Z raise RuntimeError(error) 2025-12-04T12:42:03.8290706Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8290764Z Traceback (most recent call last): 2025-12-04T12:42:03.8290925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8290983Z getattr(self, test_name)() 2025-12-04T12:42:03.8291142Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8291176Z fn() 2025-12-04T12:42:03.8291328Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8291367Z method(*args, **kwargs) 2025-12-04T12:42:03.8291518Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8291557Z method(*args, **kwargs) 2025-12-04T12:42:03.8291708Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8291746Z with policy(): 2025-12-04T12:42:03.8291897Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8291938Z raise RuntimeError(msg) 2025-12-04T12:42:03.8292372Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8292375Z 2025-12-04T12:42:03.8292449Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8292786Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8292789Z 2025-12-04T12:42:03.8292877Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8292880Z 2025-12-04T12:42:03.8292937Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8292983Z Traceback (most recent call last): 2025-12-04T12:42:03.8293145Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8293186Z getattr(self, test_name)() 2025-12-04T12:42:03.8293346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8293380Z fn() 2025-12-04T12:42:03.8293540Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8293583Z method(*args, **kwargs) 2025-12-04T12:42:03.8293733Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8293772Z method(*args, **kwargs) 2025-12-04T12:42:03.8293923Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8293960Z with policy(): 2025-12-04T12:42:03.8294111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8294152Z raise RuntimeError(msg) 2025-12-04T12:42:03.8294593Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440. 2025-12-04T12:42:03.8294595Z 2025-12-04T12:42:03.8294678Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8295014Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8295025Z 2025-12-04T12:42:03.8295111Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8295113Z 2025-12-04T12:42:03.8295115Z 2025-12-04T12:42:03.8295192Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8295282Z Process 2 terminated with exit code 10, terminating remaining processes. 
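[Note on the leak reports above: the failures are raised by the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 policy named in the repro command, which compares the caching allocator's per-device usage before and after the test body. The sketch below is an illustration of that before/after comparison only, not the internal checker; run_test_body and the byte threshold are hypothetical placeholders.]

    import torch

    def check_for_cuda_leak(run_test_body, device=0):
        # Snapshot the caching allocator before the test body runs.
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)

        run_test_body()

        # Drop cached-but-unused blocks so only live allocations remain,
        # then compare against the snapshot taken above.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        after = torch.cuda.memory_allocated(device)

        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: allocated memory "
                f"went from {before} to {after} bytes"
            )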
2025-12-04T12:42:03.8295555Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-30cedd8bdfe9b548.xml - 2025-12-04T12:42:03.8295615Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8295961Z FAILED [8.7146s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8296009Z Traceback (most recent call last): 2025-12-04T12:42:03.8296172Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8296215Z getattr(self, test_name)() 2025-12-04T12:42:03.8296376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8296410Z fn() 2025-12-04T12:42:03.8296561Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8296603Z method(*args, **kwargs) 2025-12-04T12:42:03.8296753Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8296794Z method(*args, **kwargs) 2025-12-04T12:42:03.8296943Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8296979Z with policy(): 2025-12-04T12:42:03.8297130Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8297172Z raise RuntimeError(msg) 2025-12-04T12:42:03.8297616Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8297620Z 2025-12-04T12:42:03.8297693Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8298029Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8298032Z 2025-12-04T12:42:03.8298117Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8298119Z 2025-12-04T12:42:03.8298233Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8298278Z Traceback (most recent call last): 2025-12-04T12:42:03.8298442Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8298483Z getattr(self, test_name)() 2025-12-04T12:42:03.8298644Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8298691Z fn() 2025-12-04T12:42:03.8298842Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8298896Z method(*args, **kwargs) 2025-12-04T12:42:03.8299046Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8299087Z method(*args, **kwargs) 2025-12-04T12:42:03.8299236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8299274Z with policy(): 2025-12-04T12:42:03.8299424Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8299466Z raise RuntimeError(msg) 2025-12-04T12:42:03.8299894Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2820669440. 2025-12-04T12:42:03.8299898Z 2025-12-04T12:42:03.8299970Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8300305Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8300308Z 2025-12-04T12:42:03.8300394Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8300458Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8300521Z ======================= 1 failed, 6 deselected in 8.85s ======================== 2025-12-04T12:42:03.8300559Z Got exit code 1 2025-12-04T12:42:03.8300599Z Retrying single test... 
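[Note on the FutureWarning emitted by this test file: it recommends replacing the deprecated FSDP.set_state_dict_type() path with torch.distributed.checkpoint.state_dict.get_state_dict()/set_state_dict(), per the linked API doc. A minimal sketch of that migration is below; model and optimizer are placeholders for an already-initialized FSDP-wrapped module and its optimizer, and persistence of the returned dicts is elided.]

    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    def round_trip_state(model, optimizer):
        # Gather model and optimizer state dicts via the recommended API
        # instead of FSDP.set_state_dict_type (deprecated per the warning).
        model_state, optim_state = get_state_dict(model, optimizer)

        # ... persist / reload model_state and optim_state here, e.g. with
        # torch.distributed.checkpoint ...

        # Restore both state dicts onto the wrapped model and optimizer.
        set_state_dict(
            model,
            optimizer,
            model_state_dict=model_state,
            optim_state_dict=optim_state,
        )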
2025-12-04T12:42:03.8300828Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ab6be7b832dcdd81.xml 2025-12-04T12:42:03.8300884Z ============================= test session starts ============================== 2025-12-04T12:42:03.8300997Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8301039Z cachedir: .pytest_cache 2025-12-04T12:42:03.8301197Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8301241Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8301282Z configfile: pytest.ini 2025-12-04T12:42:03.8301469Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8301829Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8301879Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8302233Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8302291Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8302348Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8302679Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8302734Z Running 1 items in this shard 2025-12-04T12:42:03.8302746Z 2025-12-04T12:42:03.8303156Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:37:08.412000 469445 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 469514 2025-12-04T12:42:03.8303311Z I1204 12:37:08.413000 469445 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 469515 2025-12-04T12:42:03.8303463Z I1204 12:37:08.414000 469445 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 469516 2025-12-04T12:42:03.8303614Z I1204 12:37:08.415000 469445 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 469517 2025-12-04T12:42:03.8304297Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8304343Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8305014Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8305058Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8305731Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8305772Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8306280Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8306330Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8307010Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8307052Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8307547Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8307614Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8308105Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8308182Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8308670Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8308718Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8308852Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8309009Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8309295Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8309441Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8309722Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8309840Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8310111Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8310253Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8310537Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8310680Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8310950Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8311080Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8311364Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8311505Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8312059Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T12:42:03.8312192Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8312383Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8312838Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8312947Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8313152Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8313311Z E1204 12:37:15.782000 469514 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8313440Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8313592Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8313871Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8314019Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8314295Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8314409Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8314678Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8314818Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8315105Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8315245Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8315516Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8315668Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8315937Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8316079Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8316640Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8316760Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8316947Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8317403Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8317512Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8317715Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8317875Z E1204 12:37:15.800000 469517 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8318003Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8318188Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8318466Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8318613Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8318931Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8319047Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8319329Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8319469Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8319737Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8319876Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8320157Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8320285Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8320554Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8320708Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8321272Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8321380Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8321568Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8322024Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8322132Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8322333Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8322490Z E1204 12:37:15.808000 469516 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8322621Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8322775Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8323057Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8323204Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8323482Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8323606Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8323876Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8324017Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8324283Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8324432Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8324700Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8324826Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8325106Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8325264Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8325817Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8325924Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8326113Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8326571Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8326678Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8326879Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8327036Z E1204 12:37:15.842000 469515 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8327076Z FAILED [8.6147s] [100%] 2025-12-04T12:42:03.8327079Z 2025-12-04T12:42:03.8327135Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8327316Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8327363Z Traceback (most recent call last): 2025-12-04T12:42:03.8327525Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8327570Z self._join_processes(fn) 2025-12-04T12:42:03.8327742Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8327809Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8327986Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8328031Z raise RuntimeError(error) 2025-12-04T12:42:03.8328112Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8328186Z Traceback (most recent call last): 2025-12-04T12:42:03.8328346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8328388Z getattr(self, test_name)() 2025-12-04T12:42:03.8328560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8328595Z fn() 2025-12-04T12:42:03.8328747Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8328787Z method(*args, **kwargs) 2025-12-04T12:42:03.8328937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8328990Z method(*args, **kwargs) 2025-12-04T12:42:03.8329158Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8329195Z with policy(): 2025-12-04T12:42:03.8329347Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8329386Z raise RuntimeError(msg) 2025-12-04T12:42:03.8329818Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8329821Z 2025-12-04T12:42:03.8329895Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8330238Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8330241Z 2025-12-04T12:42:03.8330328Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8330330Z 2025-12-04T12:42:03.8330332Z 2025-12-04T12:42:03.8330409Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8330497Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8330769Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ab6be7b832dcdd81.xml - 2025-12-04T12:42:03.8330830Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8331178Z FAILED [8.6147s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8331226Z Traceback (most recent call last): 2025-12-04T12:42:03.8331389Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8331433Z getattr(self, test_name)() 2025-12-04T12:42:03.8331591Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8331626Z fn() 2025-12-04T12:42:03.8331788Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8331829Z method(*args, **kwargs) 2025-12-04T12:42:03.8331981Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8332020Z method(*args, **kwargs) 2025-12-04T12:42:03.8332168Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8332207Z with policy(): 2025-12-04T12:42:03.8332358Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8332408Z raise RuntimeError(msg) 2025-12-04T12:42:03.8332843Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8332857Z 2025-12-04T12:42:03.8332931Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8333269Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8333282Z 2025-12-04T12:42:03.8333368Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8333515Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8333577Z ======================= 1 failed, 14 deselected in 8.78s ======================= 2025-12-04T12:42:03.8333615Z Got exit code 1 2025-12-04T12:42:03.8333654Z Retrying single test... 2025-12-04T12:42:03.8333880Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-58416b2f8055c3c6.xml 2025-12-04T12:42:03.8333938Z ============================= test session starts ============================== 2025-12-04T12:42:03.8334053Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8334095Z cachedir: .pytest_cache 2025-12-04T12:42:03.8334252Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8334298Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8334337Z configfile: pytest.ini 2025-12-04T12:42:03.8334503Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8334864Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8334916Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8335261Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8335320Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8335375Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8335705Z stepcurrent: skipping 6 already run items. 
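The PytestCollectionWarning lines emitted during collection above are benign: pytest treats any class whose name matches its Test* pattern as a candidate test class and refuses to collect it when that class defines __init__, which is exactly the case for the two nn.Module helpers in test_fsdp_dtensor_state_dict.py. A minimal illustration of the rule (class names below are made up for the example, not taken from the test file):

    import torch

    class TestCollectedByPytest:              # collected: Test* name, no __init__
        def test_something(self):
            assert True

    class TestHelperModel(torch.nn.Module):   # skipped with a PytestCollectionWarning,
        def __init__(self):                   # because a Test*-named class defines __init__
            super().__init__()
            self.linear = torch.nn.Linear(2, 2)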
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8335749Z Running 1 items in this shard 2025-12-04T12:42:03.8335752Z 2025-12-04T12:42:03.8336168Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:37:19.658000 469847 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 469916 2025-12-04T12:42:03.8336326Z I1204 12:37:19.659000 469847 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 469917 2025-12-04T12:42:03.8336477Z I1204 12:37:19.660000 469847 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 469918 2025-12-04T12:42:03.8336639Z I1204 12:37:19.661000 469847 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 469919 2025-12-04T12:42:03.8337322Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8337384Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8338057Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8338098Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8338632Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8338681Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8339353Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8339395Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8339890Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8339939Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8340429Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8340476Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8341162Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8341205Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8341708Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8341756Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8341893Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8342066Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8342365Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8342514Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8342794Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8342911Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8343180Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8343324Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8343592Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8343733Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8344002Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8344132Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8344403Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8344543Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8345114Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
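The FutureWarning repeated above points at torch.distributed.checkpoint.state_dict as the replacement for FSDP.state_dict_type()/FSDP.set_state_dict_type(). Roughly, the migration looks like the sketch below; model and optim are assumed placeholders for the FSDP-wrapped module and its optimizer, and the API doc linked in the warning remains the authoritative reference:

    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        get_state_dict,
        set_state_dict,
    )

    # Gather sharded model/optimizer state; cpu_offload mirrors the
    # offload_to_cpu=True flavor exercised by the failing test.
    options = StateDictOptions(cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optim, options=options)

    # Load the gathered state back into a (re)wrapped model and optimizer.
    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )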
2025-12-04T12:42:03.8345227Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8345415Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8345881Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8345990Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8346194Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8346351Z E1204 12:37:27.158000 469916 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8346499Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8346661Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8346941Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8347088Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8347366Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8347483Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8347751Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8347892Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8348196Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8348338Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8348607Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8348736Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8349006Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8349146Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8349713Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1254096896 and is now 2820669440. 2025-12-04T12:42:03.8349822Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8350011Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8350477Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8350586Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8350799Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8350967Z E1204 12:37:27.159000 469918 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8351097Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8351249Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8351530Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8351677Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8351954Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8352072Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8352339Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8352479Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8352746Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8352887Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8353153Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8353281Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8353551Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8353701Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8354252Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8354361Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8354558Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8355013Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8355129Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8355341Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8355497Z E1204 12:37:27.166000 469917 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8355627Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8355777Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8356057Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8356204Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8356482Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8356597Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8356864Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8357004Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8357272Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8357412Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8357679Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8357806Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8358087Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8358255Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8358825Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8358932Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8359122Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8359575Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8359707Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8359910Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8360065Z E1204 12:37:27.202000 469919 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8360104Z FAILED [8.7148s] [100%] 2025-12-04T12:42:03.8360106Z 2025-12-04T12:42:03.8360162Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8360346Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8360393Z Traceback (most recent call last): 2025-12-04T12:42:03.8360558Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8360601Z self._join_processes(fn) 2025-12-04T12:42:03.8360774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8360828Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8361008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8361050Z raise RuntimeError(error) 2025-12-04T12:42:03.8361133Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8361179Z Traceback (most recent call last): 2025-12-04T12:42:03.8361339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8361382Z getattr(self, test_name)() 2025-12-04T12:42:03.8361541Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8361576Z fn() 2025-12-04T12:42:03.8361726Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8361767Z method(*args, **kwargs) 2025-12-04T12:42:03.8361917Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8361957Z method(*args, **kwargs) 2025-12-04T12:42:03.8362121Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8362161Z with policy(): 2025-12-04T12:42:03.8362311Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8362353Z raise RuntimeError(msg) 2025-12-04T12:42:03.8362801Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8362803Z 2025-12-04T12:42:03.8362880Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8363220Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8363233Z 2025-12-04T12:42:03.8363321Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8363334Z 2025-12-04T12:42:03.8363395Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8363440Z Traceback (most recent call last): 2025-12-04T12:42:03.8363604Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8363646Z getattr(self, test_name)() 2025-12-04T12:42:03.8363808Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8363841Z fn() 2025-12-04T12:42:03.8363992Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8364033Z method(*args, **kwargs) 2025-12-04T12:42:03.8364184Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8364224Z method(*args, **kwargs) 2025-12-04T12:42:03.8364376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8364416Z with policy(): 2025-12-04T12:42:03.8364568Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8364608Z raise RuntimeError(msg) 2025-12-04T12:42:03.8365041Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1254096896 and is now 2820669440. 
2025-12-04T12:42:03.8365044Z 2025-12-04T12:42:03.8365119Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8365455Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8365458Z 2025-12-04T12:42:03.8365545Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8365547Z 2025-12-04T12:42:03.8365549Z 2025-12-04T12:42:03.8365625Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8365714Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8365996Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-58416b2f8055c3c6.xml - 2025-12-04T12:42:03.8366058Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8366404Z FAILED [8.7148s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8366453Z Traceback (most recent call last): 2025-12-04T12:42:03.8366616Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8366668Z getattr(self, test_name)() 2025-12-04T12:42:03.8366830Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8366863Z fn() 2025-12-04T12:42:03.8367017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8367055Z method(*args, **kwargs) 2025-12-04T12:42:03.8367217Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8367267Z method(*args, **kwargs) 2025-12-04T12:42:03.8367417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8367453Z with policy(): 2025-12-04T12:42:03.8367605Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8367646Z raise RuntimeError(msg) 2025-12-04T12:42:03.8368083Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8368086Z 2025-12-04T12:42:03.8368196Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8368532Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8368535Z 2025-12-04T12:42:03.8368623Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8368625Z 2025-12-04T12:42:03.8368684Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8368730Z Traceback (most recent call last): 2025-12-04T12:42:03.8368891Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8368934Z getattr(self, test_name)() 2025-12-04T12:42:03.8369094Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8369129Z fn() 2025-12-04T12:42:03.8369316Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8369358Z method(*args, **kwargs) 2025-12-04T12:42:03.8369507Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8369547Z method(*args, **kwargs) 2025-12-04T12:42:03.8369697Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8369734Z with policy(): 2025-12-04T12:42:03.8369887Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8369929Z raise RuntimeError(msg) 2025-12-04T12:42:03.8370375Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1254096896 and is now 2820669440. 2025-12-04T12:42:03.8370380Z 2025-12-04T12:42:03.8370453Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8370802Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8370804Z 2025-12-04T12:42:03.8370889Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8370953Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:42:03.8371015Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8371065Z Got exit code 1 2025-12-04T12:42:03.8371348Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8371491Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8371720Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-caa2a66b5eb1de6f.xml 2025-12-04T12:42:03.8371777Z ============================= test session starts ============================== 2025-12-04T12:42:03.8371890Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8371931Z cachedir: .pytest_cache 2025-12-04T12:42:03.8372089Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8372136Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8372176Z configfile: pytest.ini 2025-12-04T12:42:03.8372341Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8372705Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8372756Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8373103Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8373160Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8373216Z collected 15 items / 7 deselected / 8 selected 2025-12-04T12:42:03.8373268Z stepcurrent: skipping 7 already run items. 2025-12-04T12:42:03.8373313Z Running 8 items in this shard 2025-12-04T12:42:03.8373315Z 2025-12-04T12:42:03.8373721Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:37:30.893000 470249 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 470318 2025-12-04T12:42:03.8373879Z I1204 12:37:30.893000 470249 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 470319 2025-12-04T12:42:03.8374043Z I1204 12:37:30.894000 470249 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 470320 2025-12-04T12:42:03.8374195Z I1204 12:37:30.895000 470249 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 470321 2025-12-04T12:42:03.8374879Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8374932Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8375608Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8375670Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8376339Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8376381Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8376879Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8376929Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8377419Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8377466Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8378184Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8378226Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8378718Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8378764Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8379267Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8379317Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8379451Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8379607Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8379901Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8380051Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8380345Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8380473Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8380744Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8380885Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8381155Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8381295Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8381564Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8381694Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8381967Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8382108Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8382667Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 958398464 and is now 2820669440. 2025-12-04T12:42:03.8382779Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8382970Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8383435Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8383546Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8383748Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8383906Z E1204 12:37:38.388000 470321 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8384047Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8384201Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8384483Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8384642Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8384930Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8385045Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8385314Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8385454Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8385723Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8385862Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8386130Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8386258Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T12:42:03.8386529Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8386670Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8387219Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8387329Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8387517Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8387983Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8388093Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8388331Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8388505Z E1204 12:37:38.401000 470318 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8388636Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8388790Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8389082Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8389242Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8389519Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8389635Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8389904Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8390044Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8390312Z E1204 
12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8390451Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8390719Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8390848Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8391122Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8391264Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8391814Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1256194048 and is now 2820669440. 2025-12-04T12:42:03.8391934Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8392123Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8392578Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8392686Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8392898Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8393057Z E1204 12:37:38.402000 470320 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8393186Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8393348Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8393643Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8393790Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8394066Z E1204 12:37:38.482000 470319 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8394182Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8394449Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8394590Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8394859Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8394997Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8395264Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8395392Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8395666Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8395806Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8396364Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8396473Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8396662Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8397126Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8397233Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8397437Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8397605Z E1204 12:37:38.482000 470319 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8397655Z FAILED [8.7134s] [ 12%] 2025-12-04T12:42:03.8397658Z 2025-12-04T12:42:03.8397715Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8397895Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8397943Z Traceback (most recent call last): 2025-12-04T12:42:03.8398107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8398192Z self._join_processes(fn) 2025-12-04T12:42:03.8398386Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8398458Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8398637Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8398682Z raise RuntimeError(error) 2025-12-04T12:42:03.8398765Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8398811Z Traceback (most recent call last): 2025-12-04T12:42:03.8398971Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8399015Z getattr(self, test_name)() 2025-12-04T12:42:03.8399173Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8399208Z fn() 2025-12-04T12:42:03.8399359Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8399402Z method(*args, **kwargs) 2025-12-04T12:42:03.8399552Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8399594Z method(*args, **kwargs) 2025-12-04T12:42:03.8399743Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8399781Z with policy(): 2025-12-04T12:42:03.8399934Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8399975Z raise RuntimeError(msg) 2025-12-04T12:42:03.8400429Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1256194048 and is now 2820669440. 2025-12-04T12:42:03.8400432Z 2025-12-04T12:42:03.8400509Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8400849Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8400852Z 2025-12-04T12:42:03.8400938Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8400941Z 2025-12-04T12:42:03.8401013Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8401059Z Traceback (most recent call last): 2025-12-04T12:42:03.8401224Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8401267Z getattr(self, test_name)() 2025-12-04T12:42:03.8401428Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8401480Z fn() 2025-12-04T12:42:03.8401630Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8401687Z method(*args, **kwargs) 2025-12-04T12:42:03.8401836Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8401876Z method(*args, **kwargs) 2025-12-04T12:42:03.8402025Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8402064Z with policy(): 2025-12-04T12:42:03.8402215Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8402258Z raise RuntimeError(msg) 2025-12-04T12:42:03.8402685Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 958398464 and is now 2820669440. 
2025-12-04T12:42:03.8402689Z 2025-12-04T12:42:03.8402763Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8403102Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8403106Z 2025-12-04T12:42:03.8403192Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8403194Z 2025-12-04T12:42:03.8403196Z 2025-12-04T12:42:03.8403273Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8403361Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8403633Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-caa2a66b5eb1de6f.xml - 2025-12-04T12:42:03.8403694Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8404041Z FAILED [8.7134s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8404089Z Traceback (most recent call last): 2025-12-04T12:42:03.8404265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8404307Z getattr(self, test_name)() 2025-12-04T12:42:03.8404469Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8404503Z fn() 2025-12-04T12:42:03.8404655Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8404693Z method(*args, **kwargs) 2025-12-04T12:42:03.8404844Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8404884Z method(*args, **kwargs) 2025-12-04T12:42:03.8405042Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8405081Z with policy(): 2025-12-04T12:42:03.8405233Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8405274Z raise RuntimeError(msg) 2025-12-04T12:42:03.8405716Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1256194048 and is now 2820669440. 
2025-12-04T12:42:03.8405730Z 2025-12-04T12:42:03.8405805Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8406140Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8406142Z 2025-12-04T12:42:03.8406228Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8406231Z 2025-12-04T12:42:03.8406290Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8406335Z Traceback (most recent call last): 2025-12-04T12:42:03.8406498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8406540Z getattr(self, test_name)() 2025-12-04T12:42:03.8406701Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8406735Z fn() 2025-12-04T12:42:03.8406886Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8406925Z method(*args, **kwargs) 2025-12-04T12:42:03.8407076Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8407115Z method(*args, **kwargs) 2025-12-04T12:42:03.8407265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8407302Z with policy(): 2025-12-04T12:42:03.8407453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8407494Z raise RuntimeError(msg) 2025-12-04T12:42:03.8407925Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 958398464 and is now 2820669440. 2025-12-04T12:42:03.8407928Z 2025-12-04T12:42:03.8408000Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8408390Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8408394Z 2025-12-04T12:42:03.8408481Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8408546Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8408609Z ======================= 1 failed, 7 deselected in 8.85s ======================== 2025-12-04T12:42:03.8408646Z Got exit code 1 2025-12-04T12:42:03.8408686Z Retrying single test... 
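The leak reported above is raised by the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 harness, which snapshots each device's caching-allocator and driver memory before the test body runs and compares the numbers again afterwards. Below is a minimal sketch of that kind of before/after comparison using only public torch.cuda calls; the real check lives in torch/testing/_internal/common_utils.py (the `with policy():` context whose __exit__ raises the RuntimeError in the tracebacks) and differs in detail, so treat this as illustrative rather than the actual implementation.

    import torch

    def _memory_snapshot(device: int):
        # Caching-allocator view plus a driver-level view of device memory.
        # mem_get_info reports (free, total) for the whole device, so the
        # "driver allocated" number also includes memory held by other processes.
        free, total = torch.cuda.mem_get_info(device)
        return torch.cuda.memory_allocated(device), total - free

    def run_with_leak_check(device: int, test_fn):
        # Hypothetical helper, not the harness's real entry point.
        torch.cuda.synchronize(device)
        alloc_before, driver_before = _memory_snapshot(device)
        test_fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after, driver_after = _memory_snapshot(device)
        if alloc_after > alloc_before:
            raise RuntimeError(
                f"Caching allocator allocated memory was {alloc_before} and is now "
                f"reported as {alloc_after} on device {device}. CUDA driver allocated "
                f"memory was {driver_before} and is now {driver_after}."
            )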
2025-12-04T12:42:03.8408927Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-79558c292143c3e0.xml 2025-12-04T12:42:03.8408987Z ============================= test session starts ============================== 2025-12-04T12:42:03.8409100Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8409142Z cachedir: .pytest_cache 2025-12-04T12:42:03.8409300Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8409365Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8409419Z configfile: pytest.ini 2025-12-04T12:42:03.8409581Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8409939Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8409990Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8410336Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8410394Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8410449Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8410779Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8410823Z Running 1 items in this shard 2025-12-04T12:42:03.8410825Z 2025-12-04T12:42:03.8411229Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:37:42.079000 470651 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 470720 2025-12-04T12:42:03.8411386Z I1204 12:37:42.079000 470651 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 470721 2025-12-04T12:42:03.8411538Z I1204 12:37:42.080000 470651 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 470722 2025-12-04T12:42:03.8411690Z I1204 12:37:42.081000 470651 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 470723 2025-12-04T12:42:03.8412370Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8412413Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8413097Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8413141Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8413819Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8413871Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8414541Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8414593Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8415093Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8415142Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8415633Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8415681Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8416170Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8416216Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8416707Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8416754Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8416889Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8417044Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8417335Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8417483Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8417761Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8417888Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8418198Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8418342Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8418625Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8418777Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8419048Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8419177Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8419447Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8419588Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8420184Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8420295Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8420486Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8420943Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8421052Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8421258Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8421417Z E1204 12:37:49.469000 470723 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8421548Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8421713Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8421992Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8422142Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8422432Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8422549Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8422817Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8422967Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8423246Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8423385Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8423654Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8423783Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8424054Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8424194Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8424743Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8424852Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8425040Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8425496Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8425606Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8425811Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8425977Z E1204 12:37:49.476000 470722 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8426109Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8426262Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8426542Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8426703Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8426982Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8427097Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8427376Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8427528Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8427796Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8427936Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8428240Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8428367Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8428638Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8428778Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8429328Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8429437Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8429625Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8430081Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8430187Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8430408Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8430566Z E1204 12:37:49.481000 470720 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8430698Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8430849Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8431141Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8431288Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8431565Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8431704Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8431971Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8432114Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8432380Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8432522Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8432793Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8432921Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8433191Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8433330Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8433877Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8433985Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8434174Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8434638Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8434745Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8434948Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8435105Z E1204 12:37:49.566000 470721 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8435145Z FAILED [8.6136s] [100%] 2025-12-04T12:42:03.8435147Z 2025-12-04T12:42:03.8435212Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8435395Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8435444Z Traceback (most recent call last): 2025-12-04T12:42:03.8435609Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8435668Z self._join_processes(fn) 2025-12-04T12:42:03.8435841Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8435906Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8436083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8436126Z raise RuntimeError(error) 2025-12-04T12:42:03.8436206Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8436253Z Traceback (most recent call last): 2025-12-04T12:42:03.8436412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8436455Z getattr(self, test_name)() 2025-12-04T12:42:03.8436614Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8436651Z fn() 2025-12-04T12:42:03.8436801Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8436843Z method(*args, **kwargs) 2025-12-04T12:42:03.8436992Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8437032Z method(*args, **kwargs) 2025-12-04T12:42:03.8437183Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8437221Z with policy(): 2025-12-04T12:42:03.8437372Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8437413Z raise RuntimeError(msg) 2025-12-04T12:42:03.8437844Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8437848Z 2025-12-04T12:42:03.8437922Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8438292Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8438295Z 2025-12-04T12:42:03.8438381Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8438383Z 2025-12-04T12:42:03.8438385Z 2025-12-04T12:42:03.8438477Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8438565Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8438835Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-79558c292143c3e0.xml - 2025-12-04T12:42:03.8438895Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8439252Z FAILED [8.6136s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8439300Z Traceback (most recent call last): 2025-12-04T12:42:03.8439463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8439506Z getattr(self, test_name)() 2025-12-04T12:42:03.8439666Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8439713Z fn() 2025-12-04T12:42:03.8439878Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8439918Z method(*args, **kwargs) 2025-12-04T12:42:03.8440067Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8440106Z method(*args, **kwargs) 2025-12-04T12:42:03.8440255Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8440293Z with policy(): 2025-12-04T12:42:03.8440443Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8440484Z raise RuntimeError(msg) 2025-12-04T12:42:03.8440914Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8440918Z 2025-12-04T12:42:03.8440992Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8441328Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8441330Z 2025-12-04T12:42:03.8441416Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8441479Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8441540Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.8441578Z Got exit code 1 2025-12-04T12:42:03.8441618Z Retrying single test... 2025-12-04T12:42:03.8441844Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-61029b26705589f5.xml 2025-12-04T12:42:03.8441901Z ============================= test session starts ============================== 2025-12-04T12:42:03.8442015Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8442057Z cachedir: .pytest_cache 2025-12-04T12:42:03.8442215Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8442260Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8442299Z configfile: pytest.ini 2025-12-04T12:42:03.8442480Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8442838Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8442889Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8443244Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8443302Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8443358Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8443685Z stepcurrent: skipping 7 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8443739Z Running 1 items in this shard 2025-12-04T12:42:03.8443751Z 2025-12-04T12:42:03.8444155Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:37:53.265000 471053 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 471122 2025-12-04T12:42:03.8444310Z I1204 12:37:53.265000 471053 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 471123 2025-12-04T12:42:03.8444463Z I1204 12:37:53.266000 471053 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 471124 2025-12-04T12:42:03.8444614Z I1204 12:37:53.267000 471053 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 471125 2025-12-04T12:42:03.8445294Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8445339Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8445837Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8445886Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8446557Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8446600Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8447282Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8447326Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8447828Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8447875Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8448578Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:189: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8448647Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8449136Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8449183Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8449676Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8449723Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8449859Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8450014Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8450297Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8450443Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8450721Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8450838Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8451108Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8451251Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8451530Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8451671Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8451939Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8452069Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8452351Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8452493Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8453046Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2820669440. 
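The RuntimeError above is raised by the leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it snapshots per-device caching-allocator and driver allocations before the test and fails if either number is still higher once the test has finished. A minimal sketch of that before/after comparison using only public torch.cuda calls (it mirrors the idea, not the internal implementation; the tolerance argument is an assumption added for illustration):

    import gc
    import torch

    def check_for_cuda_leak(fn, device=0, tol_bytes=0):
        # Snapshot caching-allocator and driver-level usage before running fn.
        torch.cuda.synchronize(device)
        gc.collect()
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)       # caching allocator
        free_before, total = torch.cuda.mem_get_info(device)     # driver view
        driver_before = total - free_before

        fn()

        # Snapshot again after the workload; any remaining growth is suspicious.
        torch.cuda.synchronize(device)
        gc.collect()
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        if alloc_after - alloc_before > tol_bytes:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes, driver "
                f"{driver_before} -> {driver_after} bytes"
            )

In the failures logged here the caching-allocator number grows from 0 to 7680 bytes on every rank, which is why all four processes exit with code 10.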
2025-12-04T12:42:03.8453176Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8453365Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8453820Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8453928Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8454132Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8454291Z E1204 12:38:00.766000 471125 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8454421Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8454573Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8454854Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8455002Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8455279Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8455393Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8455662Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8455801Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8456078Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8462845Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8463162Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8463348Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8463625Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8463773Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8464350Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2820669440. 2025-12-04T12:42:03.8464487Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8464681Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8465142Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8465257Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8465462Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8465624Z E1204 12:38:00.772000 471123 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8465757Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8465912Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8466199Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8466349Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8466631Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8466748Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8467036Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8467179Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8467451Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8467593Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8467872Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8468004Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8468323Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8468479Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8469056Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8469169Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8469360Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8469819Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8469929Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8470133Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8470304Z E1204 12:38:00.773000 471122 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8470471Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8470626Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8470909Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8471055Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8471337Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8471466Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8471736Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8471879Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8472148Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8472301Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8472574Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8472714Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8472983Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8473176Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8473754Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2820669440. 
2025-12-04T12:42:03.8473862Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8474053Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8474513Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8474620Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8474824Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8474982Z E1204 12:38:00.832000 471124 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8475025Z FAILED [8.7134s] [100%] 2025-12-04T12:42:03.8475030Z 2025-12-04T12:42:03.8475091Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8475274Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8475325Z Traceback (most recent call last): 2025-12-04T12:42:03.8475491Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8475537Z self._join_processes(fn) 2025-12-04T12:42:03.8475711Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8475784Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8475965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8476011Z raise RuntimeError(error) 2025-12-04T12:42:03.8476096Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8476143Z Traceback (most recent call last): 2025-12-04T12:42:03.8476306Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8476349Z getattr(self, test_name)() 2025-12-04T12:42:03.8476521Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8476557Z fn() 2025-12-04T12:42:03.8476710Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8476752Z method(*args, **kwargs) 2025-12-04T12:42:03.8476904Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8476957Z method(*args, **kwargs) 2025-12-04T12:42:03.8477126Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8477164Z with policy(): 2025-12-04T12:42:03.8477318Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8477360Z raise RuntimeError(msg) 2025-12-04T12:42:03.8477794Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2820669440. 2025-12-04T12:42:03.8477797Z 2025-12-04T12:42:03.8477875Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8478263Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8478267Z 2025-12-04T12:42:03.8478358Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8478360Z 2025-12-04T12:42:03.8478362Z 2025-12-04T12:42:03.8478442Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8478532Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8478808Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-61029b26705589f5.xml - 2025-12-04T12:42:03.8478872Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8479223Z FAILED [8.7134s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8479271Z Traceback (most recent call last): 2025-12-04T12:42:03.8479439Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8479483Z getattr(self, test_name)() 2025-12-04T12:42:03.8500205Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8500252Z fn() 2025-12-04T12:42:03.8500429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8500471Z method(*args, **kwargs) 2025-12-04T12:42:03.8500626Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8500666Z method(*args, **kwargs) 2025-12-04T12:42:03.8500815Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8500852Z with policy(): 2025-12-04T12:42:03.8501004Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8501057Z raise RuntimeError(msg) 2025-12-04T12:42:03.8501494Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1113587712 and is now 2820669440. 2025-12-04T12:42:03.8501511Z 2025-12-04T12:42:03.8501588Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8501926Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8501942Z 2025-12-04T12:42:03.8502032Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8502097Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8502161Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8502198Z Got exit code 1 2025-12-04T12:42:03.8502486Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8502615Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8502845Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-abf0463f5e31e225.xml 2025-12-04T12:42:03.8502904Z ============================= test session starts ============================== 2025-12-04T12:42:03.8503018Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8503059Z cachedir: .pytest_cache 2025-12-04T12:42:03.8503219Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8503265Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8503306Z configfile: pytest.ini 2025-12-04T12:42:03.8503471Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8503832Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8503885Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8504231Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8504289Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8504342Z collected 15 items / 8 deselected / 7 selected 2025-12-04T12:42:03.8504405Z stepcurrent: skipping 8 already run items. 
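The FutureWarnings emitted for every rank (and repeated in the session below) point away from FSDP.state_dict_type()/FSDP.set_state_dict_type() and toward torch.distributed.checkpoint.state_dict. A minimal sketch of that replacement, assuming an FSDP-wrapped model and its optimizer already exist; the call signatures follow the doc page linked in the warning and are not verified against this exact build:

    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        get_state_dict,
        set_state_dict,
    )

    # Extract sharded model + optimizer state in one call instead of wrapping
    # the calls in FSDP.set_state_dict_type(...).
    opts = StateDictOptions(full_state_dict=False, cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optimizer, options=opts)

    # Load both back the same way.
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=opts,
    )

Per the warning text, the same helpers are meant to cover FSDP1, FSDP2 and DDP, which is what it means by supporting different parallelisms.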
2025-12-04T12:42:03.8504449Z Running 7 items in this shard 2025-12-04T12:42:03.8504451Z 2025-12-04T12:42:03.8504877Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:38:04.522000 471455 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 471524 2025-12-04T12:42:03.8505034Z I1204 12:38:04.523000 471455 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 471525 2025-12-04T12:42:03.8505197Z I1204 12:38:04.523000 471455 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 471526 2025-12-04T12:42:03.8505347Z I1204 12:38:04.524000 471455 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 471527 2025-12-04T12:42:03.8506033Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8506100Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8506771Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8506814Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8507482Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8507524Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8508236Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8508277Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8508779Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8508829Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8509341Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8509388Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8509875Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8509922Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8510424Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8510470Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8511158Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8511213Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8511881Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8511923Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8512588Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8512629Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8513296Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8513339Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8513829Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8513889Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8514386Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8514444Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8514940Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8514997Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8515484Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8515567Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8515805Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8515850Z local_shape = tensor.shape 2025-12-04T12:42:03.8516083Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8516126Z local_shape = tensor.shape 2025-12-04T12:42:03.8516358Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8516395Z tensor.shape, 2025-12-04T12:42:03.8516626Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8516664Z tensor.dtype, 2025-12-04T12:42:03.8516894Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8516929Z tensor.shape, 2025-12-04T12:42:03.8517160Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8517195Z tensor.dtype, 2025-12-04T12:42:03.8517427Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8517468Z local_shape = tensor.shape 2025-12-04T12:42:03.8517701Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8517737Z tensor.shape, 2025-12-04T12:42:03.8517968Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8518003Z tensor.dtype, 2025-12-04T12:42:03.8518272Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8518316Z local_shape = tensor.shape 2025-12-04T12:42:03.8518559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8518598Z tensor.shape, 2025-12-04T12:42:03.8518828Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8518865Z tensor.dtype, 2025-12-04T12:42:03.8519002Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8519173Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8519457Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8519605Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8519898Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8520030Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8520303Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8520444Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8520715Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8520864Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8521162Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8521292Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8521562Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8521704Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8522271Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1260388352 and is now 2850029568. 
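The "Please use DTensor instead and we are deprecating ShardedTensor" warnings above come from the ShardedTensor branch of _state_dict_utils.py. For reference, a minimal sketch of building the DTensor equivalent directly; it assumes the default process group is already initialized and uses a 1-D mesh over all ranks (illustrative only, not what this test itself does):

    import torch
    import torch.distributed as dist
    from torch.distributed.device_mesh import init_device_mesh
    from torch.distributed.tensor import Shard, distribute_tensor

    # Assumes dist.init_process_group(...) has already run on every rank.
    world_size = dist.get_world_size()
    mesh = init_device_mesh("cuda", (world_size,))

    # Shard dim 0 of a global tensor across the mesh; each rank keeps one slice.
    full = torch.randn(16, 8, device="cuda")
    dtensor = distribute_tensor(full, mesh, placements=[Shard(0)])

    local_shard = dtensor.to_local()      # this rank's shard
    global_view = dtensor.full_tensor()   # all-gathers the full tensor when needed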
2025-12-04T12:42:03.8522384Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8522576Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8523058Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8523168Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8523371Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8523531Z E1204 12:38:11.988000 471527 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8523671Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8523825Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8524104Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8524263Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8524553Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8524669Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8524943Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8525084Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8525353Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8525492Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8525761Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8525889Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8526159Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8526301Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8526874Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8526982Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8527183Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8527654Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8527763Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8527975Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8528133Z E1204 12:38:12.000000 471526 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8528298Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8528453Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8528744Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8528906Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8529187Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8529302Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8529570Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8529711Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8529980Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8530119Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8530386Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8530515Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8530784Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8530925Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8531487Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8531607Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8531796Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8532265Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8532392Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8532594Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8532752Z E1204 12:38:12.022000 471525 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8532893Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8533055Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8533333Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8533481Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8533759Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8533876Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8534145Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8534285Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8534554Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8534694Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8534964Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8535091Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8535362Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8535504Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8536080Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 
2025-12-04T12:42:03.8536189Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8536378Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8536858Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8536968Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8537170Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8537338Z E1204 12:38:12.046000 471524 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8537388Z FAILED [8.7151s] [ 14%] 2025-12-04T12:42:03.8537391Z 2025-12-04T12:42:03.8537449Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8537644Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8537694Z Traceback (most recent call last): 2025-12-04T12:42:03.8537856Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8537900Z self._join_processes(fn) 2025-12-04T12:42:03.8538073Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8538129Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8538350Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8538395Z raise RuntimeError(error) 2025-12-04T12:42:03.8538474Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8538521Z Traceback (most recent call last): 2025-12-04T12:42:03.8538684Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8538727Z getattr(self, test_name)() 2025-12-04T12:42:03.8538889Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8538924Z fn() 2025-12-04T12:42:03.8539076Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8539117Z method(*args, **kwargs) 2025-12-04T12:42:03.8539269Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8539309Z method(*args, **kwargs) 2025-12-04T12:42:03.8539459Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8539495Z with policy(): 2025-12-04T12:42:03.8539647Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8539688Z raise RuntimeError(msg) 2025-12-04T12:42:03.8540150Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8540154Z 2025-12-04T12:42:03.8540229Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8540582Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8540584Z 2025-12-04T12:42:03.8540687Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8540689Z 2025-12-04T12:42:03.8540691Z 2025-12-04T12:42:03.8540767Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8540858Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8541134Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-abf0463f5e31e225.xml - 2025-12-04T12:42:03.8541230Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8541589Z FAILED [8.7151s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8541636Z Traceback (most recent call last): 2025-12-04T12:42:03.8541800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8541844Z getattr(self, test_name)() 2025-12-04T12:42:03.8542005Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8542041Z fn() 2025-12-04T12:42:03.8542193Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8542234Z method(*args, **kwargs) 2025-12-04T12:42:03.8542385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8542424Z method(*args, **kwargs) 2025-12-04T12:42:03.8542575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8542612Z with policy(): 2025-12-04T12:42:03.8542764Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8542804Z raise RuntimeError(msg) 2025-12-04T12:42:03.8543253Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T12:42:03.8543257Z 
2025-12-04T12:42:03.8543332Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.8543683Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T12:42:03.8543685Z 
2025-12-04T12:42:03.8543773Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.8543837Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T12:42:03.8543910Z ======================= 1 failed, 8 deselected in 8.85s ========================
2025-12-04T12:42:03.8543946Z Got exit code 1
2025-12-04T12:42:03.8543987Z Retrying single test...
2025-12-04T12:42:03.8544213Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-75b1f48c62bfaf0e.xml
2025-12-04T12:42:03.8544272Z ============================= test session starts ==============================
2025-12-04T12:42:03.8544385Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python
2025-12-04T12:42:03.8544426Z cachedir: .pytest_cache
2025-12-04T12:42:03.8544597Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T12:42:03.8544644Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T12:42:03.8544683Z configfile: pytest.ini
2025-12-04T12:42:03.8544847Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T12:42:03.8545204Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T12:42:03.8545280Z class TestDummyModel(torch.nn.Module):
2025-12-04T12:42:03.8545629Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T12:42:03.8545688Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T12:42:03.8545745Z collected 15 items / 14 deselected / 1 selected
2025-12-04T12:42:03.8546083Z stepcurrent: skipping 8 already run items.
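The repro instructions printed above boil down to one command plus two environment variables. A small, hypothetical wrapper (not part of the repository) that runs the same command from the base repo dir is sketched below; only the env var names, the test file and the test id are taken from the log, the rest is ordinary subprocess plumbing.

# Hypothetical helper: rerun the failing test locally with the env vars from
# the repro banner. Assumes the working directory is a pytorch checkout with
# a ROCm build of torch installed.
import os
import subprocess
import sys

TEST_FILE = "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py"
TEST_NAME = (
    "TestFSDPWithDeviceMeshAndDTensorCUDA."
    "test_dtensor_sharded_tensor_state_dict_identical_"
    "offload_to_cpu_False_is_even_sharded_model_False_cuda"
)

env = dict(os.environ)
env["PYTORCH_TEST_WITH_ROCM"] = "1"
env["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"
# Set to "0" instead to silence the repro banner on failure.
env["PYTORCH_PRINT_REPRO_ON_FAILURE"] = "1"

result = subprocess.run([sys.executable, TEST_FILE, TEST_NAME], env=env)
sys.exit(result.returncode)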
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8546128Z Running 1 items in this shard 2025-12-04T12:42:03.8546131Z 2025-12-04T12:42:03.8546548Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:38:15.815000 471857 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 471926 2025-12-04T12:42:03.8546704Z I1204 12:38:15.816000 471857 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 471927 2025-12-04T12:42:03.8546855Z I1204 12:38:15.816000 471857 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 471928 2025-12-04T12:42:03.8547008Z I1204 12:38:15.817000 471857 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 471929 2025-12-04T12:42:03.8547689Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8547735Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8548467Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8548510Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8549181Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8549251Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8549918Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8549986Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8550485Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8550535Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8551028Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8551075Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8551567Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8551613Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8552101Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8552147Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8552823Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8552865Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8553541Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8553585Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8554266Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8554307Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8554798Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8554878Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8555364Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8555422Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8556092Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8556135Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8556623Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8556680Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8557164Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8557223Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8557460Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8557504Z local_shape = tensor.shape 2025-12-04T12:42:03.8557739Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8557775Z tensor.shape, 2025-12-04T12:42:03.8558018Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8558061Z local_shape = tensor.shape 2025-12-04T12:42:03.8558334Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8558370Z tensor.dtype, 2025-12-04T12:42:03.8558618Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8558654Z tensor.shape, 2025-12-04T12:42:03.8558885Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8558921Z tensor.dtype, 2025-12-04T12:42:03.8559153Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8559210Z local_shape = tensor.shape 2025-12-04T12:42:03.8559454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8559492Z tensor.shape, 2025-12-04T12:42:03.8559723Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8559759Z tensor.dtype, 2025-12-04T12:42:03.8559988Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8560032Z local_shape = tensor.shape 2025-12-04T12:42:03.8560261Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8560300Z tensor.shape, 2025-12-04T12:42:03.8560529Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8560566Z tensor.dtype, 2025-12-04T12:42:03.8560701Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8560857Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8561141Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8561290Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8561570Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8561688Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8561959Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8562100Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8562380Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8562520Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8562790Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8562930Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8563201Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8563343Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8563919Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1256194048 and is now 2850029568. 
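Regarding the FSDP.set_state_dict_type FutureWarning repeated above: the replacement it recommends is the get_state_dict()/set_state_dict() pair from torch.distributed.checkpoint.state_dict. A migration sketch follows, under stated assumptions: keyword names mirror the documentation linked in the warning and may differ slightly between torch releases, and StateDictOptions(full_state_dict=False, cpu_offload=False) is only meant to approximate the sharded, offload_to_cpu=False configuration suggested by the failing test's name.

# Migration sketch for the FutureWarning above: replacing the deprecated
# FSDP.set_state_dict_type() pattern with the torch.distributed.checkpoint
# state_dict helpers the warning points to. Illustrative only.
import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def save_and_restore(model: torch.nn.Module, optim: torch.optim.Optimizer) -> None:
    # Sharded (per-rank) state dicts, no CPU offload.
    options = StateDictOptions(full_state_dict=False, cpu_offload=False)

    model_sd, optim_sd = get_state_dict(model, optim, options=options)

    # ... checkpoint model_sd / optim_sd, e.g. with torch.distributed.checkpoint ...

    set_state_dict(
        model,
        optim,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )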
2025-12-04T12:42:03.8564047Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8564235Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8564706Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8564816Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8565022Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8565185Z E1204 12:38:23.234000 471929 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8565315Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8565469Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8565748Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8565896Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8566173Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8566288Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8566571Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8566712Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8566980Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8567119Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8567399Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8567528Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8567798Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8567962Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8568576Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T12:42:03.8568686Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8568877Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8569348Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8569458Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8569662Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8569821Z E1204 12:38:23.244000 471926 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8569951Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8570109Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8570389Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8570538Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8570818Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8570947Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8571220Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8571363Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8571690Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8571829Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8572100Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8572241Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8572514Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8572669Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8573230Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8573338Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8573527Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8573998Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8574107Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8574309Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8574467Z E1204 12:38:23.246000 471928 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8574597Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8574749Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8575028Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8575175Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8575464Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8575581Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8575853Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8575993Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8576272Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8576411Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8576691Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8576827Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8577102Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8577245Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8577806Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T12:42:03.8577916Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8578103Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8578618Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8578726Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8578928Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8579086Z E1204 12:38:23.253000 471927 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8579125Z FAILED [8.6140s] [100%] 2025-12-04T12:42:03.8579128Z 2025-12-04T12:42:03.8579185Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8579379Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8579426Z Traceback (most recent call last): 2025-12-04T12:42:03.8579603Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8579647Z self._join_processes(fn) 2025-12-04T12:42:03.8579821Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8579877Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8580054Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8580098Z raise RuntimeError(error) 2025-12-04T12:42:03.8580177Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8580238Z Traceback (most recent call last): 2025-12-04T12:42:03.8580399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8580442Z getattr(self, test_name)() 2025-12-04T12:42:03.8580601Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8580655Z fn() 2025-12-04T12:42:03.8580807Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8580862Z method(*args, **kwargs) 2025-12-04T12:42:03.8581013Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8581054Z method(*args, **kwargs) 2025-12-04T12:42:03.8581207Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8581245Z with policy(): 2025-12-04T12:42:03.8581397Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8581439Z raise RuntimeError(msg) 2025-12-04T12:42:03.8581886Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8581890Z 2025-12-04T12:42:03.8581966Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8582315Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8582317Z 2025-12-04T12:42:03.8582407Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8582409Z 2025-12-04T12:42:03.8582411Z 2025-12-04T12:42:03.8582489Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8582577Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8582849Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-75b1f48c62bfaf0e.xml - 2025-12-04T12:42:03.8582911Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8583269Z FAILED [8.6140s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8583316Z Traceback (most recent call last): 2025-12-04T12:42:03.8583480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8583534Z getattr(self, test_name)() 2025-12-04T12:42:03.8583694Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8583731Z fn() 2025-12-04T12:42:03.8583882Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8583926Z method(*args, **kwargs) 2025-12-04T12:42:03.8584076Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8584116Z method(*args, **kwargs) 2025-12-04T12:42:03.8584277Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8584314Z with policy(): 2025-12-04T12:42:03.8584467Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8584508Z raise RuntimeError(msg) 2025-12-04T12:42:03.8584957Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568.
2025-12-04T12:42:03.8584981Z 
2025-12-04T12:42:03.8585056Z To execute this test, run the following from the base repo dir:
2025-12-04T12:42:03.8585406Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda
2025-12-04T12:42:03.8585408Z 
2025-12-04T12:42:03.8585495Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-12-04T12:42:03.8585561Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
2025-12-04T12:42:03.8585623Z ======================= 1 failed, 14 deselected in 8.75s =======================
2025-12-04T12:42:03.8585662Z Got exit code 1
2025-12-04T12:42:03.8585702Z Retrying single test...
2025-12-04T12:42:03.8585931Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c5295aa9ee49c749.xml
2025-12-04T12:42:03.8585991Z ============================= test session starts ==============================
2025-12-04T12:42:03.8586103Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python
2025-12-04T12:42:03.8586146Z cachedir: .pytest_cache
2025-12-04T12:42:03.8586301Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T12:42:03.8586347Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T12:42:03.8586387Z configfile: pytest.ini
2025-12-04T12:42:03.8586554Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T12:42:03.8586911Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T12:42:03.8586962Z class TestDummyModel(torch.nn.Module):
2025-12-04T12:42:03.8587310Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py)
2025-12-04T12:42:03.8587368Z class TestDummyModelUneven(torch.nn.Module):
2025-12-04T12:42:03.8587424Z collected 15 items / 14 deselected / 1 selected
2025-12-04T12:42:03.8587774Z stepcurrent: skipping 8 already run items.
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8587820Z Running 1 items in this shard 2025-12-04T12:42:03.8587823Z 2025-12-04T12:42:03.8588276Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda I1204 12:38:26.884000 472259 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 472328 2025-12-04T12:42:03.8588446Z I1204 12:38:26.885000 472259 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 472329 2025-12-04T12:42:03.8588598Z I1204 12:38:26.886000 472259 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 472330 2025-12-04T12:42:03.8588750Z I1204 12:38:26.886000 472259 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 472331 2025-12-04T12:42:03.8589446Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8589505Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8590181Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8590224Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8590894Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8590936Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8591603Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8591646Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8592144Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8592196Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8592700Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8592750Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8593248Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8593295Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8593780Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8593848Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8594522Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8594565Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8595240Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8595284Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8595950Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8595993Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8596483Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8596543Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8597031Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8597099Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8597774Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8597817Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8598356Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8598414Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8598909Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8598983Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8599223Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8599267Z local_shape = tensor.shape 2025-12-04T12:42:03.8599504Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8599546Z local_shape = tensor.shape 2025-12-04T12:42:03.8599781Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8599818Z tensor.shape, 2025-12-04T12:42:03.8600052Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8600091Z tensor.shape, 2025-12-04T12:42:03.8600323Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8600360Z tensor.dtype, 2025-12-04T12:42:03.8600593Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8600630Z tensor.dtype, 2025-12-04T12:42:03.8600865Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8600907Z local_shape = tensor.shape 2025-12-04T12:42:03.8601140Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8601177Z tensor.shape, 2025-12-04T12:42:03.8601411Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8601450Z tensor.dtype, 2025-12-04T12:42:03.8601701Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8601745Z local_shape = tensor.shape 2025-12-04T12:42:03.8601975Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8602014Z tensor.shape, 2025-12-04T12:42:03.8602245Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8602281Z tensor.dtype, 2025-12-04T12:42:03.8602429Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8602587Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8602916Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8603076Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8603370Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8603487Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8603758Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8603899Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8604169Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8604309Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8604579Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8604709Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8604979Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8605121Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8605688Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1262485504 and is now 2850029568. 
2025-12-04T12:42:03.8605799Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8606009Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8606478Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8606590Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8606804Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8606962Z E1204 12:38:34.338000 472330 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8607093Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8607245Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8607535Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8607696Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8607975Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8608092Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8608399Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8608540Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8608810Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8608949Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8609220Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8609349Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8609622Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8609765Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8610327Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T12:42:03.8610451Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8610642Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8611109Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8611229Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8611432Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8611591Z E1204 12:38:34.378000 472328 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8611734Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8611900Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8612180Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8612327Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8612604Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8612719Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8612990Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8613130Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8613396Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8613534Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8613803Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8613931Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8614201Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8614342Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8614921Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8615032Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8615222Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8615698Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8615807Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8616008Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8616177Z E1204 12:38:34.382000 472331 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8616319Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8616473Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8616756Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8616903Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8617183Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8617299Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8617571Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8617711Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8617980Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8618118Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8618432Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8618560Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8618832Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8618972Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8619549Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T12:42:03.8619660Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8619868Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8620338Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8620459Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8620674Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8620831Z E1204 12:38:34.406000 472329 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8620870Z FAILED [8.6165s] [100%] 2025-12-04T12:42:03.8620872Z 2025-12-04T12:42:03.8620932Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8621125Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8621174Z Traceback (most recent call last): 2025-12-04T12:42:03.8621335Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8621381Z self._join_processes(fn) 2025-12-04T12:42:03.8621553Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8621610Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8621787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8621831Z raise RuntimeError(error) 2025-12-04T12:42:03.8621911Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8621960Z Traceback (most recent call last): 2025-12-04T12:42:03.8622158Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8622203Z getattr(self, test_name)() 2025-12-04T12:42:03.8622363Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8622399Z fn() 2025-12-04T12:42:03.8622551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8622592Z method(*args, **kwargs) 2025-12-04T12:42:03.8622743Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8622784Z method(*args, **kwargs) 2025-12-04T12:42:03.8622936Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8622973Z with policy(): 2025-12-04T12:42:03.8623125Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8623181Z raise RuntimeError(msg) 2025-12-04T12:42:03.8623627Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1262485504 and is now 2850029568. 2025-12-04T12:42:03.8623632Z 2025-12-04T12:42:03.8623707Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8624067Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8624070Z 2025-12-04T12:42:03.8624158Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8624163Z 2025-12-04T12:42:03.8624166Z 2025-12-04T12:42:03.8624242Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8624342Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8624626Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c5295aa9ee49c749.xml - 2025-12-04T12:42:03.8624687Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8625045Z FAILED [8.6165s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8625092Z Traceback (most recent call last): 2025-12-04T12:42:03.8625256Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8625300Z getattr(self, test_name)() 2025-12-04T12:42:03.8625459Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8625497Z fn() 2025-12-04T12:42:03.8625648Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8625689Z method(*args, **kwargs) 2025-12-04T12:42:03.8625839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8625880Z method(*args, **kwargs) 2025-12-04T12:42:03.8626029Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8626066Z with policy(): 2025-12-04T12:42:03.8626219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8626260Z raise RuntimeError(msg) 2025-12-04T12:42:03.8626704Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1262485504 and is now 2850029568. 2025-12-04T12:42:03.8626708Z 2025-12-04T12:42:03.8626781Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8627132Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8627134Z 2025-12-04T12:42:03.8627231Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8627297Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8627361Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.8627400Z Got exit code 1 2025-12-04T12:42:03.8627696Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8627825Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8628065Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-0d50af9fa3a75953.xml 2025-12-04T12:42:03.8628125Z ============================= test session starts ============================== 2025-12-04T12:42:03.8628278Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8628337Z cachedir: .pytest_cache 2025-12-04T12:42:03.8628496Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8628566Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8628608Z configfile: pytest.ini 2025-12-04T12:42:03.8628771Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8629132Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8629183Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8629533Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8629590Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8629646Z collected 15 items / 9 deselected / 6 selected 2025-12-04T12:42:03.8629698Z stepcurrent: skipping 9 already run items. 
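Every rank in this run also emitted the FutureWarning that FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated in favor of the torch.distributed.checkpoint.state_dict APIs linked in the warning. A hedged sketch of that migration, with placeholder model and optimizer objects, might look like this:

```python
import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def checkpoint_roundtrip(model: torch.nn.Module, optim: torch.optim.Optimizer) -> None:
    # Gather sharded (non-full) state dicts in a way that works for FSDP1, FSDP2 and DDP,
    # which is what the deprecation warning recommends over FSDP.set_state_dict_type.
    model_sd, optim_sd = get_state_dict(
        model, optim, options=StateDictOptions(full_state_dict=False)
    )
    # ... persist model_sd / optim_sd, e.g. with torch.distributed.checkpoint.save(...) ...
    # Later, load them back into the (possibly re-wrapped) module and optimizer.
    set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)
```

The API doc and tutorial URLs printed in the warning walk through the same flow in more detail.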
2025-12-04T12:42:03.8629743Z Running 6 items in this shard 2025-12-04T12:42:03.8629745Z 2025-12-04T12:42:03.8630165Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:38:38.135000 472661 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 472730 2025-12-04T12:42:03.8630320Z I1204 12:38:38.136000 472661 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 472731 2025-12-04T12:42:03.8630474Z I1204 12:38:38.136000 472661 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 472732 2025-12-04T12:42:03.8630624Z I1204 12:38:38.137000 472661 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 472733 2025-12-04T12:42:03.8631309Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8631353Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8632039Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8632084Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8632765Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8632809Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8633479Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8633542Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8634042Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8634091Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8634586Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8634634Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8635125Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8635172Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8635658Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8635705Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8636388Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8636431Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8637099Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8637154Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8637824Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8637877Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8638413Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8638475Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8638958Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8639017Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8639692Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8639733Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8640221Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8640278Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8640762Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8640820Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8641071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8641116Z local_shape = tensor.shape 2025-12-04T12:42:03.8641352Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8641396Z local_shape = tensor.shape 2025-12-04T12:42:03.8641628Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8641667Z tensor.shape, 2025-12-04T12:42:03.8641910Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8641949Z tensor.shape, 2025-12-04T12:42:03.8642182Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8642226Z local_shape = tensor.shape 2025-12-04T12:42:03.8642470Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8642523Z tensor.dtype, 2025-12-04T12:42:03.8642753Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8642790Z tensor.dtype, 2025-12-04T12:42:03.8643021Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8643056Z tensor.shape, 2025-12-04T12:42:03.8643288Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8643323Z tensor.dtype, 2025-12-04T12:42:03.8643556Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8643598Z local_shape = tensor.shape 2025-12-04T12:42:03.8643832Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8643868Z tensor.shape, 2025-12-04T12:42:03.8644099Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8644135Z tensor.dtype, 2025-12-04T12:42:03.8644271Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8644427Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8644713Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8644861Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8645140Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8645257Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8645538Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8645682Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8645950Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8646101Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8646369Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8646499Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8646796Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8646949Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8647517Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1262485504 and is now 2850029568. 
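The "Please use DTensor instead and we are deprecating ShardedTensor" warnings above point at the DTensor/DeviceMesh representation that the test class name (TestFSDPWithDeviceMeshAndDTensorCUDA) already exercises. As a hedged, self-contained illustration of that representation (assuming a recent PyTorch where torch.distributed.tensor is public, and a 4-GPU job launched under torchrun so the process group can initialize):

```python
import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard, distribute_tensor

def make_dtensor_shard(world_size: int = 4) -> torch.Tensor:
    # One-dimensional mesh over the visible GPUs; on ROCm the device type is still "cuda".
    mesh = init_device_mesh("cuda", (world_size,))
    full = torch.randn(16, 8, device="cuda")
    # Each rank keeps a slice along dim 0; together the ranks represent `full` as a DTensor,
    # the replacement for the ShardedTensor path the warnings deprecate.
    return distribute_tensor(full, mesh, placements=[Shard(0)])
```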
2025-12-04T12:42:03.8647627Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8647817Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8648329Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8648437Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8648641Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8648799Z E1204 12:38:48.499000 472733 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8648933Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8649086Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8649369Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8649516Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8649808Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8649924Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8650192Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8650333Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8650613Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8650753Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8651021Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8651165Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8651450Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8651593Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8652158Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8652265Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8652455Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8652921Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8653029Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8653233Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8653390Z E1204 12:38:48.512000 472731 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8653521Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8653672Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8653951Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8654108Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8654390Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8654507Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8654776Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8654927Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8655195Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8655334Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8655613Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8655754Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8656024Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8656166Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8656729Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8656838Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8657028Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8657493Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8657601Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8657802Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8657959Z E1204 12:38:48.513000 472732 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8658089Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8658274Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8658566Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8658712Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8658989Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8659102Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8659386Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8659527Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8659794Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8659961Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8660227Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8660356Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8660626Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8660766Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8661323Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 
2025-12-04T12:42:03.8661432Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8661622Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8662089Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8662197Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8662397Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8662555Z E1204 12:38:48.545000 472730 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8662598Z FAILED [11.5166s] [ 16%] 2025-12-04T12:42:03.8662600Z 2025-12-04T12:42:03.8662657Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8662858Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8662906Z Traceback (most recent call last): 2025-12-04T12:42:03.8663070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8663113Z self._join_processes(fn) 2025-12-04T12:42:03.8663287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8663341Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8663537Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8663581Z raise RuntimeError(error) 2025-12-04T12:42:03.8663662Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8663709Z Traceback (most recent call last): 2025-12-04T12:42:03.8663873Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8663926Z getattr(self, test_name)() 2025-12-04T12:42:03.8664096Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8664131Z fn() 2025-12-04T12:42:03.8664287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8664328Z method(*args, **kwargs) 2025-12-04T12:42:03.8664480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8664519Z method(*args, **kwargs) 2025-12-04T12:42:03.8664672Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8664712Z with policy(): 2025-12-04T12:42:03.8664863Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8664906Z raise RuntimeError(msg) 2025-12-04T12:42:03.8665348Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8665351Z 2025-12-04T12:42:03.8665428Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8665777Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8665779Z 2025-12-04T12:42:03.8665869Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8665871Z 2025-12-04T12:42:03.8665873Z 2025-12-04T12:42:03.8665950Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8666037Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8666316Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-0d50af9fa3a75953.xml - 2025-12-04T12:42:03.8666378Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8669067Z FAILED [11.5166s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8669115Z Traceback (most recent call last): 2025-12-04T12:42:03.8669282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8669326Z getattr(self, test_name)() 2025-12-04T12:42:03.8669489Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8669523Z fn() 2025-12-04T12:42:03.8669675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8669734Z method(*args, **kwargs) 2025-12-04T12:42:03.8669886Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8669925Z method(*args, **kwargs) 2025-12-04T12:42:03.8670078Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8670130Z with policy(): 2025-12-04T12:42:03.8670282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8670338Z raise RuntimeError(msg) 2025-12-04T12:42:03.8670781Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8670784Z 2025-12-04T12:42:03.8670860Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8671210Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8671213Z 2025-12-04T12:42:03.8671303Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8671368Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8671432Z ======================= 1 failed, 9 deselected in 11.65s ======================= 2025-12-04T12:42:03.8671469Z Got exit code 1 2025-12-04T12:42:03.8671509Z Retrying single test... 2025-12-04T12:42:03.8671782Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a2335104fd924581.xml 2025-12-04T12:42:03.8671841Z ============================= test session starts ============================== 2025-12-04T12:42:03.8671957Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8671998Z cachedir: .pytest_cache 2025-12-04T12:42:03.8672156Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8672203Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8672244Z configfile: pytest.ini 2025-12-04T12:42:03.8672407Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8672803Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8672855Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8673216Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8673276Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8673335Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8673673Z stepcurrent: skipping 9 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8673719Z Running 1 items in this shard 2025-12-04T12:42:03.8673721Z 2025-12-04T12:42:03.8674149Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:38:52.237000 473063 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 473132 2025-12-04T12:42:03.8674304Z I1204 12:38:52.238000 473063 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 473133 2025-12-04T12:42:03.8674466Z I1204 12:38:52.238000 473063 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 473134 2025-12-04T12:42:03.8674628Z I1204 12:38:52.239000 473063 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 473135 2025-12-04T12:42:03.8675309Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8675354Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8676028Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8676073Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8676743Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8676786Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8677452Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8677494Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8678007Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8678056Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8678569Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8678618Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8679127Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8679187Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8679670Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8679731Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8680408Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8680452Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8681125Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8681167Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8681834Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8681878Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8682366Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8682426Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8682931Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8682993Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8683676Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8683717Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8684204Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8684278Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8684516Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8684559Z local_shape = tensor.shape 2025-12-04T12:42:03.8684794Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8684831Z tensor.shape, 2025-12-04T12:42:03.8685065Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8685108Z local_shape = tensor.shape 2025-12-04T12:42:03.8685341Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8685380Z tensor.dtype, 2025-12-04T12:42:03.8685612Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8685649Z tensor.shape, 2025-12-04T12:42:03.8685882Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8685919Z tensor.dtype, 2025-12-04T12:42:03.8686405Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8686465Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8686697Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8686740Z local_shape = tensor.shape 2025-12-04T12:42:03.8686971Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8687007Z tensor.shape, 2025-12-04T12:42:03.8687249Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8687285Z tensor.dtype, 2025-12-04T12:42:03.8687517Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8687559Z local_shape = tensor.shape 2025-12-04T12:42:03.8687790Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8687837Z tensor.shape, 2025-12-04T12:42:03.8688069Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8688104Z tensor.dtype, 2025-12-04T12:42:03.8688278Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8688447Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8688746Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8688892Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8689173Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8689291Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8689559Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8689702Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8689971Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8690112Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8690380Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8690512Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8690785Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8690925Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8691503Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1107296256 and is now 2850029568. 
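The FutureWarning repeated above already names the replacement API: torch.distributed.checkpoint.state_dict.get_state_dict() and set_state_dict() instead of FSDP.set_state_dict_type(). As a hedged sketch of what that migration might look like (the model and optimizer variables are placeholders, not objects from this test):

from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

def roundtrip_state(model, optimizer):
    # Gather model and optimizer state in the format the warning recommends.
    model_sd, optim_sd = get_state_dict(model, optimizer)

    # ... persist model_sd / optim_sd, e.g. with torch.distributed.checkpoint ...

    # Restore into the same (or an equivalently wrapped) model and optimizer.
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
    )

The doc and tutorial URLs printed in the warning cover further options, such as full vs. sharded state dicts and CPU offload.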
2025-12-04T12:42:03.8691613Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8691808Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8692288Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8692397Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8692602Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8692759Z E1204 12:38:59.633000 473135 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8692902Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8693064Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8693349Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8693495Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8693777Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8693894Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8694162Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8694303Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8694572Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8694713Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8694980Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8695114Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8695383Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8695529Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8696105Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T12:42:03.8696214Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8696404Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8696878Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8696987Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8697200Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8697372Z E1204 12:38:59.645000 473132 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8697504Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8697657Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8697938Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8698084Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8698403Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8698520Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8698792Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8698933Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8699201Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8699343Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8699612Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8699745Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8700017Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8700177Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8700744Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8700853Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8701057Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8701523Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8701644Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8701861Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8702018Z E1204 12:38:59.653000 473134 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8702152Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8702306Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8702589Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8702738Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8703021Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8703137Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8703408Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8703552Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8703821Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8703964Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8704233Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8704362Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8704643Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8704787Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8705363Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T12:42:03.8705471Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8705667Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8706133Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8706261Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8706463Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8706625Z E1204 12:38:59.656000 473133 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8706669Z FAILED [8.5134s] [100%] 2025-12-04T12:42:03.8706671Z 2025-12-04T12:42:03.8706730Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8706924Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8706974Z Traceback (most recent call last): 2025-12-04T12:42:03.8707141Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8707185Z self._join_processes(fn) 2025-12-04T12:42:03.8707360Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8707414Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8707597Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8707642Z raise RuntimeError(error) 2025-12-04T12:42:03.8707726Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8707775Z Traceback (most recent call last): 2025-12-04T12:42:03.8707941Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8707987Z getattr(self, test_name)() 2025-12-04T12:42:03.8708188Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8708223Z fn() 2025-12-04T12:42:03.8708377Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8708417Z method(*args, **kwargs) 2025-12-04T12:42:03.8708570Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8708610Z method(*args, **kwargs) 2025-12-04T12:42:03.8708775Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8708817Z with policy(): 2025-12-04T12:42:03.8708968Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8709015Z raise RuntimeError(msg) 2025-12-04T12:42:03.8709469Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1107296256 and is now 2850029568. 2025-12-04T12:42:03.8709472Z 2025-12-04T12:42:03.8709551Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8709900Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8709916Z 2025-12-04T12:42:03.8710009Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8710023Z 2025-12-04T12:42:03.8710025Z 2025-12-04T12:42:03.8710104Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8710191Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8710466Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a2335104fd924581.xml - 2025-12-04T12:42:03.8710528Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8710885Z FAILED [8.5134s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8710932Z Traceback (most recent call last): 2025-12-04T12:42:03.8711104Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8711147Z getattr(self, test_name)() 2025-12-04T12:42:03.8711311Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8711345Z fn() 2025-12-04T12:42:03.8711498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8711540Z method(*args, **kwargs) 2025-12-04T12:42:03.8711693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8711732Z method(*args, **kwargs) 2025-12-04T12:42:03.8711883Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8711921Z with policy(): 2025-12-04T12:42:03.8712075Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8712115Z raise RuntimeError(msg) 2025-12-04T12:42:03.8712564Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1107296256 and is now 2850029568. 2025-12-04T12:42:03.8712567Z 2025-12-04T12:42:03.8712643Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8713002Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8713006Z 2025-12-04T12:42:03.8713095Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8713159Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8713221Z ======================= 1 failed, 14 deselected in 8.65s ======================= 2025-12-04T12:42:03.8713258Z Got exit code 1 2025-12-04T12:42:03.8713301Z Retrying single test... 2025-12-04T12:42:03.8713537Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-548e425cbf16424e.xml 2025-12-04T12:42:03.8713596Z ============================= test session starts ============================== 2025-12-04T12:42:03.8713709Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8713767Z cachedir: .pytest_cache 2025-12-04T12:42:03.8713926Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8713981Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8714023Z configfile: pytest.ini 2025-12-04T12:42:03.8714186Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8714547Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8714599Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8714948Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8715008Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8715067Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8715402Z stepcurrent: skipping 9 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8715449Z Running 1 items in this shard 2025-12-04T12:42:03.8715453Z 2025-12-04T12:42:03.8715870Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda I1204 12:39:03.209000 473465 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 473534 2025-12-04T12:42:03.8716025Z I1204 12:39:03.210000 473465 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 473535 2025-12-04T12:42:03.8716177Z I1204 12:39:03.211000 473465 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 473536 2025-12-04T12:42:03.8716328Z I1204 12:39:03.211000 473465 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 473537 2025-12-04T12:42:03.8717021Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8717065Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8717738Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8717783Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8718513Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8718568Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8719257Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8719299Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8719799Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8719849Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8720344Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8720391Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8720880Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8720929Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8721414Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8721463Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8722152Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8722200Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8722880Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8722921Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8723632Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8723696Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8724184Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8724244Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8724911Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8724956Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8725444Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8725503Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8725990Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8726047Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8726529Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8726595Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8726832Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8726879Z local_shape = tensor.shape 2025-12-04T12:42:03.8727115Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8727158Z local_shape = tensor.shape 2025-12-04T12:42:03.8727399Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8727438Z tensor.shape, 2025-12-04T12:42:03.8727670Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8727711Z tensor.shape, 2025-12-04T12:42:03.8727941Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8728003Z local_shape = tensor.shape 2025-12-04T12:42:03.8728262Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8728301Z tensor.dtype, 2025-12-04T12:42:03.8728532Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8728571Z tensor.dtype, 2025-12-04T12:42:03.8728802Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8728840Z tensor.shape, 2025-12-04T12:42:03.8729074Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8729112Z tensor.dtype, 2025-12-04T12:42:03.8729345Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8729385Z local_shape = tensor.shape 2025-12-04T12:42:03.8729618Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8729654Z tensor.shape, 2025-12-04T12:42:03.8729885Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8729921Z tensor.dtype, 2025-12-04T12:42:03.8730057Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8730212Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8730496Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8730647Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8730927Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8731063Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8731335Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8731480Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8731768Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8731910Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8732179Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8732323Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8732606Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8732747Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8733314Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 950009856 and is now 2850029568. 
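The FutureWarning quoted above names the replacement API directly: torch.distributed.checkpoint.state_dict.get_state_dict / set_state_dict instead of FSDP.set_state_dict_type. A minimal migration sketch follows; the model and optimizer objects and the StateDictOptions values are placeholders, not taken from test_fsdp_dtensor_state_dict.py.

# Sketch of the migration the FutureWarning suggests; placeholder objects only.
import torch
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def checkpoint_roundtrip(model, optimizer):
    # Per the warning text, these helpers work across FSDP1, FSDP2 and DDP.
    options = StateDictOptions(full_state_dict=False, cpu_offload=True)
    model_sd, optim_sd = get_state_dict(model, optimizer, options=options)
    # ... persist model_sd / optim_sd (e.g. with torch.distributed.checkpoint) ...
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
        options=options,
    )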
2025-12-04T12:42:03.8733425Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8733617Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8734084Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8734194Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8734402Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8734561Z E1204 12:39:10.700000 473537 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8734694Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8734846Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8735127Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8735272Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8735560Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8735677Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8735947Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8736099Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8736367Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8736509Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8736788Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8736927Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8737199Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8737343Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8737910Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 3003121664. 2025-12-04T12:42:03.8738019Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8738241Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8738708Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8738816Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8739021Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8739180Z E1204 12:39:10.703000 473534 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8739310Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8739463Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8739761Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8739907Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8740186Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8740300Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8740584Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8740725Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8740995Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8741154Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8741434Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8741563Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8741833Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8741977Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8742540Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8742647Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8742837Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8746342Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8746464Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8746672Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8746831Z E1204 12:39:10.739000 473535 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8746962Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8747143Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8747429Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8747577Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8747867Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8747985Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8748294Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8748456Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8748724Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8748880Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8749152Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8749279Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8749553Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8749694Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8750260Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2850029568. 
2025-12-04T12:42:03.8750370Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8750560Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8751033Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8751141Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8751345Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8751502Z E1204 12:39:10.749000 473536 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8751568Z FAILED [8.8158s] [100%] 2025-12-04T12:42:03.8751571Z 2025-12-04T12:42:03.8751631Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8751823Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8751875Z Traceback (most recent call last): 2025-12-04T12:42:03.8752040Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8752086Z self._join_processes(fn) 2025-12-04T12:42:03.8752273Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8752332Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8752510Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8752556Z raise RuntimeError(error) 2025-12-04T12:42:03.8752649Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8752698Z Traceback (most recent call last): 2025-12-04T12:42:03.8752871Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8752914Z getattr(self, test_name)() 2025-12-04T12:42:03.8753072Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8753108Z fn() 2025-12-04T12:42:03.8753261Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8753305Z method(*args, **kwargs) 2025-12-04T12:42:03.8753457Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8753499Z method(*args, **kwargs) 2025-12-04T12:42:03.8753650Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8753689Z with policy(): 2025-12-04T12:42:03.8753844Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8753885Z raise RuntimeError(msg) 2025-12-04T12:42:03.8754331Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8754334Z 2025-12-04T12:42:03.8754412Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8754765Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8754769Z 2025-12-04T12:42:03.8754859Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8754861Z 2025-12-04T12:42:03.8754864Z 2025-12-04T12:42:03.8754940Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8755029Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8755303Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-548e425cbf16424e.xml - 2025-12-04T12:42:03.8755365Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8755736Z FAILED [8.8158s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8755785Z Traceback (most recent call last): 2025-12-04T12:42:03.8755953Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8755997Z getattr(self, test_name)() 2025-12-04T12:42:03.8756157Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8756206Z fn() 2025-12-04T12:42:03.8756357Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8756399Z method(*args, **kwargs) 2025-12-04T12:42:03.8756551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8756601Z method(*args, **kwargs) 2025-12-04T12:42:03.8756750Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8756799Z with policy(): 2025-12-04T12:42:03.8756950Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8756991Z raise RuntimeError(msg) 2025-12-04T12:42:03.8757438Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda! 
Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2850029568. 2025-12-04T12:42:03.8757441Z 2025-12-04T12:42:03.8757516Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8757867Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8757871Z 2025-12-04T12:42:03.8757959Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8758025Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8758086Z ======================= 1 failed, 14 deselected in 8.95s ======================= 2025-12-04T12:42:03.8758125Z Got exit code 1 2025-12-04T12:42:03.8758453Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8758583Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8758814Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-51d2e4c0b094a25d.xml 2025-12-04T12:42:03.8758873Z ============================= test session starts ============================== 2025-12-04T12:42:03.8758989Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8759141Z cachedir: .pytest_cache 2025-12-04T12:42:03.8759303Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8759350Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8759391Z configfile: pytest.ini 2025-12-04T12:42:03.8759556Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8759938Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8759992Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8760342Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8760415Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8760474Z collected 15 items / 10 deselected / 5 selected 2025-12-04T12:42:03.8760526Z stepcurrent: skipping 10 already run items. 
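The PytestCollectionWarning above is emitted because the helper nn.Module classes are named Test* and therefore look like test classes to pytest. One standard way to silence it is to mark the helpers as non-tests with __test__ = False (renaming the classes so they do not start with "Test" would also work). The module body below is a placeholder; only the __test__ attribute matters:

import torch

class TestDummyModel(torch.nn.Module):
    __test__ = False  # tell pytest not to collect this nn.Module helper

    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(8, 8)  # placeholder layers

    def forward(self, x):
        return self.net(x)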
2025-12-04T12:42:03.8760571Z Running 5 items in this shard 2025-12-04T12:42:03.8760573Z 2025-12-04T12:42:03.8760994Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:39:14.540000 473867 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 473936 2025-12-04T12:42:03.8761179Z I1204 12:39:14.540000 473867 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 473937 2025-12-04T12:42:03.8761334Z I1204 12:39:14.541000 473867 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 473938 2025-12-04T12:42:03.8761486Z I1204 12:39:14.542000 473867 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 473939 2025-12-04T12:42:03.8762171Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8762217Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8762891Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8762934Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8763601Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8763644Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8764319Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8764370Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8764871Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8764922Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8765420Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8765469Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8765956Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8766031Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8766517Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8766565Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8767242Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8767286Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8767955Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8767996Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8768716Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8768758Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8769264Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8769327Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8769999Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8770053Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8770541Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8770613Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8771109Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8771167Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8771649Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8771706Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8771943Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8771989Z local_shape = tensor.shape 2025-12-04T12:42:03.8772223Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8772262Z tensor.shape, 2025-12-04T12:42:03.8772494Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8772531Z tensor.dtype, 2025-12-04T12:42:03.8772764Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8772809Z local_shape = tensor.shape 2025-12-04T12:42:03.8773043Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8773079Z tensor.shape, 2025-12-04T12:42:03.8773310Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8773348Z tensor.dtype, 2025-12-04T12:42:03.8773578Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8773648Z local_shape = tensor.shape 2025-12-04T12:42:03.8773981Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8774019Z tensor.shape, 2025-12-04T12:42:03.8774252Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8774287Z tensor.dtype, 2025-12-04T12:42:03.8774530Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8774572Z local_shape = tensor.shape 2025-12-04T12:42:03.8774803Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8774840Z tensor.shape, 2025-12-04T12:42:03.8775070Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8775117Z tensor.dtype, 2025-12-04T12:42:03.8775266Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8775422Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8775709Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8775857Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8776137Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8776256Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8776527Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8776671Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8776939Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8777083Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8777352Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8777483Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8777757Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8777897Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8778518Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
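Several of the FutureWarnings above ask for DTensor in place of ShardedTensor. A minimal sketch of the DTensor path, assuming the public torch.distributed.tensor / torch.distributed.device_mesh modules of a recent torch; the tensor shape and names are illustrative, and the snippet has to run under a distributed launcher such as torchrun:

import torch
import torch.distributed as dist
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard, distribute_tensor

dist.init_process_group("nccl")  # RCCL on ROCm
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())
mesh = init_device_mesh("cuda", (dist.get_world_size(),))

full = torch.randn(16, 8, device="cuda")
# Shard dim 0 across the mesh; each rank materializes only its slice.
dtensor = distribute_tensor(full, mesh, placements=[Shard(0)])
print(dist.get_rank(), dtensor.to_local().shape)

dist.destroy_process_group()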
2025-12-04T12:42:03.8778630Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8778820Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8779303Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8779412Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8779631Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8779805Z E1204 12:39:21.937000 473936 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8779936Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8780089Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8780369Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8780516Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8780798Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8780916Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8781186Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8781327Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8781596Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8781737Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8782005Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8782134Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8782405Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8782556Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8783117Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T12:42:03.8783235Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8783426Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8783900Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8784034Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8784236Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8784394Z E1204 12:39:21.958000 473938 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8784524Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8784677Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8784956Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8785103Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8785381Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8785496Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8785766Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8785907Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8786176Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8786317Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8786588Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8786717Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8786997Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8787139Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8787712Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1113587712 and is now 2826960896. 2025-12-04T12:42:03.8787820Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8788010Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8788520Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8788642Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8788845Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8789002Z E1204 12:39:21.991000 473939 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8789134Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8789287Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8789567Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8789713Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8789991Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8790106Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8790375Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8790519Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8790787Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8790927Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8791207Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8791336Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8791607Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8791749Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8792328Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T12:42:03.8792435Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8792638Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8793114Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8793222Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8793424Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8793583Z E1204 12:39:22.012000 473937 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8793626Z FAILED [8.5146s] [ 20%] 2025-12-04T12:42:03.8793629Z 2025-12-04T12:42:03.8793687Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8793880Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8793927Z Traceback (most recent call last): 2025-12-04T12:42:03.8794094Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8794138Z self._join_processes(fn) 2025-12-04T12:42:03.8794316Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8794371Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8794553Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8794597Z raise RuntimeError(error) 2025-12-04T12:42:03.8794680Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8794724Z Traceback (most recent call last): 2025-12-04T12:42:03.8794887Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8794929Z getattr(self, test_name)() 2025-12-04T12:42:03.8795089Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8795124Z fn() 2025-12-04T12:42:03.8795288Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8795330Z method(*args, **kwargs) 2025-12-04T12:42:03.8795485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8795525Z method(*args, **kwargs) 2025-12-04T12:42:03.8795678Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8795716Z with policy(): 2025-12-04T12:42:03.8795870Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8795912Z raise RuntimeError(msg) 2025-12-04T12:42:03.8796365Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8796368Z 2025-12-04T12:42:03.8796455Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8796805Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8796817Z 2025-12-04T12:42:03.8796908Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8796910Z 2025-12-04T12:42:03.8796912Z 2025-12-04T12:42:03.8796990Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8797079Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8797355Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-51d2e4c0b094a25d.xml - 2025-12-04T12:42:03.8797417Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8797774Z FAILED [8.5146s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8797821Z Traceback (most recent call last): 2025-12-04T12:42:03.8797988Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8798032Z getattr(self, test_name)() 2025-12-04T12:42:03.8798231Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8798266Z fn() 2025-12-04T12:42:03.8798423Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8798464Z method(*args, **kwargs) 2025-12-04T12:42:03.8798617Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8798658Z method(*args, **kwargs) 2025-12-04T12:42:03.8798811Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8798847Z with policy(): 2025-12-04T12:42:03.8799002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8799043Z raise RuntimeError(msg) 2025-12-04T12:42:03.8799503Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8799507Z 2025-12-04T12:42:03.8799583Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8799931Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8799933Z 2025-12-04T12:42:03.8800023Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8800106Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8800171Z ======================= 1 failed, 10 deselected in 8.66s ======================= 2025-12-04T12:42:03.8800208Z Got exit code 1 2025-12-04T12:42:03.8800250Z Retrying single test... 2025-12-04T12:42:03.8800479Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-8b9d0678f025b599.xml 2025-12-04T12:42:03.8800551Z ============================= test session starts ============================== 2025-12-04T12:42:03.8800677Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8800722Z cachedir: .pytest_cache 2025-12-04T12:42:03.8800883Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8800929Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8800972Z configfile: pytest.ini 2025-12-04T12:42:03.8801136Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8801498Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8801550Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8801900Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8801957Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8802015Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8802358Z stepcurrent: skipping 10 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8802404Z Running 1 items in this shard 2025-12-04T12:42:03.8802407Z 2025-12-04T12:42:03.8802823Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:39:25.702000 474269 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 474338 2025-12-04T12:42:03.8802982Z I1204 12:39:25.703000 474269 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 474339 2025-12-04T12:42:03.8803139Z I1204 12:39:25.703000 474269 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 474340 2025-12-04T12:42:03.8803290Z I1204 12:39:25.704000 474269 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 474341 2025-12-04T12:42:03.8803984Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8804030Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8804711Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8804755Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8805422Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8805485Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8806155Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8806198Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8806699Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8806748Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8807241Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8807287Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8807778Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8807827Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8808359Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8808426Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8809100Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8809144Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8809827Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8809880Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8810549Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8810605Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8811094Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8811155Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8811638Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8811697Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8812180Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8812238Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8812914Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8812956Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8813451Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8813509Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8813748Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8813792Z local_shape = tensor.shape 2025-12-04T12:42:03.8814038Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8814082Z local_shape = tensor.shape 2025-12-04T12:42:03.8814315Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8814353Z tensor.shape, 2025-12-04T12:42:03.8814595Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8814650Z local_shape = tensor.shape 2025-12-04T12:42:03.8814882Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8814920Z tensor.dtype, 2025-12-04T12:42:03.8815152Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8815189Z tensor.shape, 2025-12-04T12:42:03.8815420Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8815457Z tensor.dtype, 2025-12-04T12:42:03.8815688Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8815725Z tensor.shape, 2025-12-04T12:42:03.8815954Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8815993Z tensor.dtype, 2025-12-04T12:42:03.8816224Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8816264Z local_shape = tensor.shape 2025-12-04T12:42:03.8816495Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8816532Z tensor.shape, 2025-12-04T12:42:03.8816762Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8816798Z tensor.dtype, 2025-12-04T12:42:03.8816935Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8817090Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8817378Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8817543Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8817823Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8817942Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8818249Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8818407Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8818676Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8818816Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8819098Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8819241Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8819511Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8819652Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8820217Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1113587712 and is now 2826960896. 
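The FutureWarning repeated above points away from FSDP.set_state_dict_type and toward the torch.distributed.checkpoint state-dict helpers. A minimal sketch of that migration, assuming an already-initialized process group with an FSDP-wrapped module and its optimizer (the names model and optim below are placeholders, not taken from this test); the API doc linked in the warning is the authoritative reference:

from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

# Gather sharded (DTensor-based) state dicts for the model and optimizer.
model_sd, optim_sd = get_state_dict(model, optim)

# Restore both later from the gathered dicts.
set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)

# A full, CPU-offloaded state dict (as exercised by offload_to_cpu=True in this
# test's parametrization) can be requested through StateDictOptions.
opts = StateDictOptions(full_state_dict=True, cpu_offload=True)
full_model_sd, full_optim_sd = get_state_dict(model, optim, options=opts)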
2025-12-04T12:42:03.8820327Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8820518Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8820989Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8821100Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8821305Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8821462Z E1204 12:39:33.223000 474341 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8821593Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8821744Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8822035Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8822182Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8822460Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8822574Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8822853Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8822998Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8823263Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8823421Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8823687Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8823817Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8824088Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8824263Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8824823Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8824932Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8825122Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8825588Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8825696Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8825899Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8826056Z E1204 12:39:33.231000 474339 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8826188Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8826353Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8826635Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8826782Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8827068Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8827182Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8827451Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8827603Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8827880Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8828019Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8828317Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8828446Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8828714Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8828856Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8829419Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8829527Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8829715Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8830179Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8830289Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8830490Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8830662Z E1204 12:39:33.239000 474338 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8830794Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8830945Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8831224Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8831383Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8831663Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8831777Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8832059Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8832221Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8832490Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8832629Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8832896Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8833024Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8833295Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8833435Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8833995Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T12:42:03.8834104Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8834295Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8834767Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8834874Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8835084Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8835242Z E1204 12:39:33.288000 474340 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8835282Z FAILED [8.7146s] [100%] 2025-12-04T12:42:03.8835286Z 2025-12-04T12:42:03.8835341Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8835534Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8835592Z Traceback (most recent call last): 2025-12-04T12:42:03.8835755Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8835798Z self._join_processes(fn) 2025-12-04T12:42:03.8835971Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8836035Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8836212Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8836264Z raise RuntimeError(error) 2025-12-04T12:42:03.8836345Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8836390Z Traceback (most recent call last): 2025-12-04T12:42:03.8836553Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8836595Z getattr(self, test_name)() 2025-12-04T12:42:03.8836753Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8836787Z fn() 2025-12-04T12:42:03.8836941Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8836983Z method(*args, **kwargs) 2025-12-04T12:42:03.8837134Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8837174Z method(*args, **kwargs) 2025-12-04T12:42:03.8837324Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8837360Z with policy(): 2025-12-04T12:42:03.8837513Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8837553Z raise RuntimeError(msg) 2025-12-04T12:42:03.8837994Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8837997Z 2025-12-04T12:42:03.8838072Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8838459Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8838461Z 2025-12-04T12:42:03.8838551Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8838554Z 2025-12-04T12:42:03.8838556Z 2025-12-04T12:42:03.8838630Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8838718Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8839002Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-8b9d0678f025b599.xml - 2025-12-04T12:42:03.8839063Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8839423Z FAILED [8.7146s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.8839469Z Traceback (most recent call last): 2025-12-04T12:42:03.8839649Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8839691Z getattr(self, test_name)() 2025-12-04T12:42:03.8839852Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8839886Z fn() 2025-12-04T12:42:03.8840037Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8840092Z method(*args, **kwargs) 2025-12-04T12:42:03.8840257Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8840296Z method(*args, **kwargs) 2025-12-04T12:42:03.8840446Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8840482Z with policy(): 2025-12-04T12:42:03.8840634Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8840674Z raise RuntimeError(msg) 2025-12-04T12:42:03.8841118Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8841121Z 2025-12-04T12:42:03.8841198Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8841547Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8841550Z 2025-12-04T12:42:03.8841639Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8841701Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8841764Z ======================= 1 failed, 14 deselected in 8.85s ======================= 2025-12-04T12:42:03.8841801Z Got exit code 1 2025-12-04T12:42:03.8841842Z Retrying single test... 2025-12-04T12:42:03.8842069Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-58cd89e09e93ed0b.xml 2025-12-04T12:42:03.8842129Z ============================= test session starts ============================== 2025-12-04T12:42:03.8842241Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8842284Z cachedir: .pytest_cache 2025-12-04T12:42:03.8842441Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8842487Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8842528Z configfile: pytest.ini 2025-12-04T12:42:03.8842692Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8843060Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8843112Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8843459Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8843515Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8843582Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8843928Z stepcurrent: skipping 10 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8843973Z Running 1 items in this shard 2025-12-04T12:42:03.8843986Z 2025-12-04T12:42:03.8844402Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda I1204 12:39:36.935000 474671 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 474740 2025-12-04T12:42:03.8844568Z I1204 12:39:36.936000 474671 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 474741 2025-12-04T12:42:03.8844721Z I1204 12:39:36.937000 474671 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 474742 2025-12-04T12:42:03.8844870Z I1204 12:39:36.938000 474671 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 474743 2025-12-04T12:42:03.8845552Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8845597Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8846271Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8846314Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8846985Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8847027Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8847706Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8847749Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8848283Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8848345Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8848834Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8848900Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8849390Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8849450Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8849937Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8849983Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8850656Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8850699Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8851366Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8851408Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8852079Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8852119Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8852621Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8852682Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8853174Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8853233Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8853714Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8853792Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8854469Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8854511Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8854998Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8855055Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8855291Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8855334Z local_shape = tensor.shape 2025-12-04T12:42:03.8855569Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8855607Z tensor.shape, 2025-12-04T12:42:03.8855838Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8855875Z tensor.dtype, 2025-12-04T12:42:03.8856104Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8856149Z local_shape = tensor.shape 2025-12-04T12:42:03.8856378Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8856420Z local_shape = tensor.shape 2025-12-04T12:42:03.8856650Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8856686Z tensor.shape, 2025-12-04T12:42:03.8856927Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8856965Z tensor.shape, 2025-12-04T12:42:03.8857196Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8857233Z tensor.dtype, 2025-12-04T12:42:03.8857463Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8857499Z tensor.dtype, 2025-12-04T12:42:03.8857740Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8857783Z local_shape = tensor.shape 2025-12-04T12:42:03.8858014Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8858060Z tensor.shape, 2025-12-04T12:42:03.8858333Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8858384Z tensor.dtype, 2025-12-04T12:42:03.8858520Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8858676Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8858959Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8859107Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8859386Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8859503Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8859776Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8859917Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8860185Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8860327Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8860594Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8860724Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8860994Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8861147Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8861710Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
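The RuntimeError above comes from the CUDA memory-leak check this shard enables via mem_leak_check: the harness records caching-allocator usage before the test body and fails the test if the number has grown afterwards. A minimal sketch of that before/after comparison, using only public torch.cuda counters and a single device; this is an illustration of the mechanism, not the actual common_utils implementation.

import torch

def run_with_leak_check(test_fn, device=0):
    # snapshot caching-allocator usage before the test body runs
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    before = torch.cuda.memory_allocated(device)
    test_fn()
    # re-check after the test; any growth is treated as a potential leak,
    # analogous to "allocated memory was 0 and is now reported as 4608" above
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible CUDA memory leak on device {device}: "
            f"{before} bytes before, {after} bytes after"
        )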
2025-12-04T12:42:03.8861821Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8862024Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8862492Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8862615Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8862830Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8862988Z E1204 12:39:44.339000 474740 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8863120Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8863271Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8863550Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8863696Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8863973Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8864088Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8864358Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8864501Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8864771Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8864913Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8865179Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8865307Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8865587Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8865728Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8866304Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8866412Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8866603Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8867080Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8867197Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8867400Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8867557Z E1204 12:39:44.348000 474742 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8867688Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8867842Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8868124Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8868305Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8868587Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8868701Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8868969Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8869110Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8869379Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8869520Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8869800Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8869928Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8870198Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8870340Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8870915Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 3. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T12:42:03.8871023Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8871225Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8871702Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8871810Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8872012Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8872169Z E1204 12:39:44.350000 474743 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8872298Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8872451Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8872731Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8872876Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8873155Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8873270Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8873538Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8873678Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8873946Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8874095Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8874365Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8874496Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8874796Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8874946Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8875504Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T12:42:03.8875631Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8875820Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8876291Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8876399Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8876601Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8876760Z E1204 12:39:44.357000 474741 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8876799Z FAILED [8.6136s] [100%] 2025-12-04T12:42:03.8876802Z 2025-12-04T12:42:03.8876859Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8877051Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.8877099Z Traceback (most recent call last): 2025-12-04T12:42:03.8877263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8877306Z self._join_processes(fn) 2025-12-04T12:42:03.8877480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8877533Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8877713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8877757Z raise RuntimeError(error) 2025-12-04T12:42:03.8877837Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8877882Z Traceback (most recent call last): 2025-12-04T12:42:03.8878045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8878087Z getattr(self, test_name)() 2025-12-04T12:42:03.8878304Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8878339Z fn() 2025-12-04T12:42:03.8878492Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8878535Z method(*args, **kwargs) 2025-12-04T12:42:03.8878689Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8878729Z method(*args, **kwargs) 2025-12-04T12:42:03.8878880Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8878917Z with policy(): 2025-12-04T12:42:03.8879083Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8879123Z raise RuntimeError(msg) 2025-12-04T12:42:03.8879567Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8879594Z 2025-12-04T12:42:03.8879670Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8880017Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8880019Z 2025-12-04T12:42:03.8880107Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8880110Z 2025-12-04T12:42:03.8880167Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8880212Z Traceback (most recent call last): 2025-12-04T12:42:03.8880375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8880419Z getattr(self, test_name)() 2025-12-04T12:42:03.8880578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8880614Z fn() 2025-12-04T12:42:03.8880765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8880806Z method(*args, **kwargs) 2025-12-04T12:42:03.8880958Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8880998Z method(*args, **kwargs) 2025-12-04T12:42:03.8881147Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8881183Z with policy(): 2025-12-04T12:42:03.8881334Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8881376Z raise RuntimeError(msg) 2025-12-04T12:42:03.8881816Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T12:42:03.8881819Z 2025-12-04T12:42:03.8881892Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8882239Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8882242Z 2025-12-04T12:42:03.8882353Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8882356Z 2025-12-04T12:42:03.8882360Z 2025-12-04T12:42:03.8882435Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8882525Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.8882797Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-58cd89e09e93ed0b.xml - 2025-12-04T12:42:03.8882858Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8883227Z FAILED [8.6136s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8883276Z Traceback (most recent call last): 2025-12-04T12:42:03.8883440Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8883494Z getattr(self, test_name)() 2025-12-04T12:42:03.8883664Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8883699Z fn() 2025-12-04T12:42:03.8883850Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8883890Z method(*args, **kwargs) 2025-12-04T12:42:03.8884040Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8884080Z method(*args, **kwargs) 2025-12-04T12:42:03.8884228Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8884266Z with policy(): 2025-12-04T12:42:03.8884416Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8884458Z raise RuntimeError(msg) 2025-12-04T12:42:03.8884899Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
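The "Please use DTensor instead and we are deprecating ShardedTensor" warnings earlier in this output refer to the DTensor sharding API. A minimal sketch of sharding a plain tensor over a 1-D device mesh, assuming a 4-rank process group is already initialized; the mesh size and device type mirror this job's 4-GPU runner, and the function name is illustrative.

import torch
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import distribute_tensor, Shard

def shard_weight(weight: torch.Tensor):
    # one mesh dimension spanning the 4 visible GPUs
    mesh = init_device_mesh("cuda", (4,))
    # shard dim 0 across the mesh instead of constructing a ShardedTensor
    return distribute_tensor(weight, mesh, [Shard(0)])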
2025-12-04T12:42:03.8884901Z 2025-12-04T12:42:03.8884975Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8885321Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8885324Z 2025-12-04T12:42:03.8885409Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8885412Z 2025-12-04T12:42:03.8885470Z Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.8885515Z Traceback (most recent call last): 2025-12-04T12:42:03.8885679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8885720Z getattr(self, test_name)() 2025-12-04T12:42:03.8885879Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8885914Z fn() 2025-12-04T12:42:03.8886063Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8886103Z method(*args, **kwargs) 2025-12-04T12:42:03.8886262Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8886302Z method(*args, **kwargs) 2025-12-04T12:42:03.8886451Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8886492Z with policy(): 2025-12-04T12:42:03.8886641Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8886682Z raise RuntimeError(msg) 2025-12-04T12:42:03.8887128Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 4608 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8887131Z 2025-12-04T12:42:03.8887207Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8887552Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8887577Z 2025-12-04T12:42:03.8887662Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8887727Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:42:03.8887789Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.8887826Z Got exit code 1 2025-12-04T12:42:03.8888121Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.8888284Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.8888512Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ba49d39c2b2d16df.xml 2025-12-04T12:42:03.8888572Z ============================= test session starts ============================== 2025-12-04T12:42:03.8888684Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8888725Z cachedir: .pytest_cache 2025-12-04T12:42:03.8888882Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8888929Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8888969Z configfile: pytest.ini 2025-12-04T12:42:03.8889133Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8889494Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8889547Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8889894Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8889952Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8890008Z collected 15 items / 11 deselected / 4 selected 2025-12-04T12:42:03.8890061Z stepcurrent: skipping 11 already run items. 2025-12-04T12:42:03.8890106Z Running 4 items in this shard 2025-12-04T12:42:03.8890108Z 2025-12-04T12:42:03.8890545Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:39:48.074000 475073 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 475142 2025-12-04T12:42:03.8890702Z I1204 12:39:48.075000 475073 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 475143 2025-12-04T12:42:03.8890853Z I1204 12:39:48.076000 475073 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 475144 2025-12-04T12:42:03.8891016Z I1204 12:39:48.077000 475073 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 475145 2025-12-04T12:42:03.8891702Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8891769Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8892443Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8892484Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8893151Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8893195Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8893868Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8893910Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8894407Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8894458Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8894950Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8894996Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8895496Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.8895543Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8896040Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8896086Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8896764Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8896831Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8897499Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8897541Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8898243Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8898284Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8898774Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8898833Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8899321Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8899379Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8900065Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. 
Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8900108Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8900592Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8900662Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8901200Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8901271Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8901508Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8901564Z local_shape = tensor.shape 2025-12-04T12:42:03.8901800Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8901838Z tensor.shape, 2025-12-04T12:42:03.8902072Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8902108Z tensor.dtype, 2025-12-04T12:42:03.8902341Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8902383Z local_shape = tensor.shape 2025-12-04T12:42:03.8902615Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8902650Z tensor.shape, 2025-12-04T12:42:03.8902880Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8902918Z tensor.dtype, 2025-12-04T12:42:03.8903148Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8903192Z local_shape = tensor.shape 2025-12-04T12:42:03.8903421Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8903458Z tensor.shape, 2025-12-04T12:42:03.8903688Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8903724Z tensor.dtype, 2025-12-04T12:42:03.8903954Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8903998Z local_shape = tensor.shape 2025-12-04T12:42:03.8904230Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8904278Z tensor.shape, 2025-12-04T12:42:03.8904508Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8904547Z tensor.dtype, 2025-12-04T12:42:03.8904684Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8904841Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8905139Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8905286Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8905566Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8905694Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8905975Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8906117Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8906387Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8906527Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8906796Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8906927Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8907199Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 
2705, in __exit__ 2025-12-04T12:42:03.8907342Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8907909Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8908021Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8908243Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8908724Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8908833Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8909037Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8909196Z E1204 12:39:55.409000 475142 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8909326Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8909492Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8909777Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8909924Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8910216Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8910344Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8910613Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8910754Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8911023Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T12:42:03.8911161Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8911431Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8911560Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8911830Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8911972Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8912534Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8912645Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8912833Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8913310Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8913419Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8913621Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8913788Z E1204 12:39:55.416000 475145 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8913917Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8914069Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8914347Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8914527Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8914805Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8914921Z 
E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8915194Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8915333Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8915602Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8915741Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8916012Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8916140Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8916410Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8916551Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8917109Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
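The "Started process N with pid ..." lines and the _join_processes/_check_return_codes frames in the failure summaries reflect the usual one-process-per-GPU test pattern: spawn a worker per rank, join them, and surface any non-zero exit code as a test failure. A rough sketch of that shape, assuming four ranks and an empty worker body; this is not the common_distributed implementation itself.

import torch.multiprocessing as mp

def _worker(rank: int, world_size: int) -> None:
    # each rank would init_process_group and run the actual test body here
    pass

def run_multiprocess_test(world_size: int = 4) -> None:
    # spawn() joins the workers and re-raises if any of them exits non-zero,
    # which the harness reports as "Process N exited with error code 10"
    mp.spawn(_worker, args=(world_size,), nprocs=world_size, join=True)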
2025-12-04T12:42:03.8917216Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8917415Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8917884Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8917992Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8918237Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8918396Z E1204 12:39:55.429000 475143 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8918526Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8918690Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8919061Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8919207Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8919485Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8919600Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8919868Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8920008Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8920276Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8920414Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8920682Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8920809Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8921082Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8921222Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8921792Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8921900Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8922088Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8922564Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8922671Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8922873Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8923030Z E1204 12:39:55.479000 475144 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8923080Z FAILED [8.5171s] [ 25%] 2025-12-04T12:42:03.8923094Z 2025-12-04T12:42:03.8923151Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8923342Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8923390Z Traceback (most recent call last): 2025-12-04T12:42:03.8923554Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8923597Z self._join_processes(fn) 2025-12-04T12:42:03.8923769Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8923824Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8924002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8924047Z raise RuntimeError(error) 2025-12-04T12:42:03.8924125Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8924172Z Traceback (most recent call last): 2025-12-04T12:42:03.8924332Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8924374Z getattr(self, test_name)() 2025-12-04T12:42:03.8924535Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:42:03.8924568Z fn() 2025-12-04T12:42:03.8924722Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8924761Z method(*args, **kwargs) 2025-12-04T12:42:03.8924912Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8924951Z method(*args, **kwargs) 2025-12-04T12:42:03.8925102Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8925139Z with policy(): 2025-12-04T12:42:03.8925305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8925358Z raise RuntimeError(msg) 2025-12-04T12:42:03.8925817Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8925820Z 2025-12-04T12:42:03.8925897Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8926242Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8926246Z 2025-12-04T12:42:03.8926334Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8926336Z 2025-12-04T12:42:03.8926338Z 2025-12-04T12:42:03.8926423Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8926512Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:42:03.8926784Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-ba49d39c2b2d16df.xml - 2025-12-04T12:42:03.8926846Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8927212Z FAILED [8.5171s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.8927268Z Traceback (most recent call last): 2025-12-04T12:42:03.8927433Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8927476Z getattr(self, test_name)() 2025-12-04T12:42:03.8927635Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8927671Z fn() 2025-12-04T12:42:03.8927822Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8927864Z method(*args, **kwargs) 2025-12-04T12:42:03.8928016Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8928056Z method(*args, **kwargs) 2025-12-04T12:42:03.8928240Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8928277Z with policy(): 2025-12-04T12:42:03.8928429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8928471Z raise RuntimeError(msg) 2025-12-04T12:42:03.8928913Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8928916Z 2025-12-04T12:42:03.8928990Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8929334Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8929336Z 2025-12-04T12:42:03.8929423Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8929486Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8929547Z ======================= 1 failed, 11 deselected in 8.66s ======================= 2025-12-04T12:42:03.8929584Z Got exit code 1 2025-12-04T12:42:03.8929623Z Retrying single test... 
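
[editor's note] The RuntimeError in the summary above is raised by the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 harness, which snapshots the CUDA caching allocator before the test body and compares it afterwards. Below is a minimal, hedged sketch of that before/after pattern only; it is not PyTorch's actual CudaMemoryLeakCheck from common_utils.py, and `check_leak` is an illustrative name, not a real helper.

```python
import torch

def check_leak(fn, device=0):
    # Sketch of a before/after caching-allocator comparison, in the spirit of
    # the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK failures above (assumption: not the
    # real CudaMemoryLeakCheck, which also cross-checks driver-allocated memory).
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)
    fn()
    torch.cuda.synchronize(device)
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"Caching allocator allocated memory was {before} "
            f"and is now reported as {after} on device {device}."
        )

if __name__ == "__main__":
    if torch.cuda.is_available():
        kept = []
        try:
            # Deliberately keep a tensor alive so the check fires.
            check_leak(lambda: kept.append(torch.ones(1024, device="cuda")))
        except RuntimeError as e:
            print(e)
```
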
2025-12-04T12:42:03.8929861Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-1732150cd52e220b.xml 2025-12-04T12:42:03.8929921Z ============================= test session starts ============================== 2025-12-04T12:42:03.8930034Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8930075Z cachedir: .pytest_cache 2025-12-04T12:42:03.8930234Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8930281Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8930340Z configfile: pytest.ini 2025-12-04T12:42:03.8930504Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8930864Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8930930Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8931288Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8931346Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8931402Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8931740Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8931785Z Running 1 items in this shard 2025-12-04T12:42:03.8931788Z 2025-12-04T12:42:03.8932202Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:39:59.401000 475475 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 475544 2025-12-04T12:42:03.8932362Z I1204 12:39:59.402000 475475 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 475545 2025-12-04T12:42:03.8932514Z I1204 12:39:59.403000 475475 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 475546 2025-12-04T12:42:03.8932667Z I1204 12:39:59.403000 475475 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 475547 2025-12-04T12:42:03.8933347Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8933392Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8934062Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8934105Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8934785Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8934829Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8935510Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8935552Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8936057Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8936117Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8936611Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8936658Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8937147Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8937194Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8937681Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8937728Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.8938434Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8938478Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8939167Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8939210Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8939889Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.8939931Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8940418Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8940490Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8940986Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8941046Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8941719Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8941763Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.8942246Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8942303Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8942783Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.8942840Z distributed_c10d._get_pg_default_device(pg).type 2025-12-04T12:42:03.8943076Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8943119Z local_shape = tensor.shape 2025-12-04T12:42:03.8943354Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8943396Z local_shape = tensor.shape 2025-12-04T12:42:03.8943638Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8943675Z tensor.shape, 2025-12-04T12:42:03.8943908Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8943950Z local_shape = tensor.shape 2025-12-04T12:42:03.8944182Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8944218Z tensor.shape, 2025-12-04T12:42:03.8944461Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8944499Z tensor.dtype, 2025-12-04T12:42:03.8944730Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8944766Z tensor.dtype, 2025-12-04T12:42:03.8945006Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8945053Z tensor.shape, 2025-12-04T12:42:03.8945283Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 
2025-12-04T12:42:03.8945319Z tensor.dtype, 2025-12-04T12:42:03.8945549Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8945591Z local_shape = tensor.shape 2025-12-04T12:42:03.8945821Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8945858Z tensor.shape, 2025-12-04T12:42:03.8946088Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor. 2025-12-04T12:42:03.8946125Z tensor.dtype, 2025-12-04T12:42:03.8946261Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8946416Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8946701Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8946850Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8947128Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8947245Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8947515Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8947656Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8947940Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8948079Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8948381Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8948512Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8948798Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8948941Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8949504Z E1204 12:40:06.754000 475547 
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T12:42:03.8949638Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8949826Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8950295Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8950404Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8950607Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8950764Z E1204 12:40:06.754000 475547 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8950895Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8951047Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8951326Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8951472Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8951752Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8951866Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8952135Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8952290Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8952558Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8952697Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8952964Z E1204 12:40:06.787000 475546 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8953102Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8953376Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8953517Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8954083Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8954204Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8954392Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8954859Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8954968Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8955169Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8955327Z E1204 12:40:06.787000 475546 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.8955456Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8955610Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8955889Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8956036Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8956313Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8956427Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8956703Z E1204 12:40:06.804000 475544 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8956844Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8957112Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8957251Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8957527Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8957655Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8957924Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8958085Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8958686Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 
2025-12-04T12:42:03.8958794Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8958980Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8959446Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8959554Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8959755Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8959912Z E1204 12:40:06.804000 475544 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8960041Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8960193Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8960472Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8960618Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8960896Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8961025Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8961295Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8961434Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8961717Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8961856Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8962124Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8962264Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8962551Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8962693Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8963251Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.8963359Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8963548Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8964017Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8964122Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8964324Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8964481Z E1204 12:40:06.832000 475545 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.8964521Z FAILED [8.6144s] [100%] 2025-12-04T12:42:03.8964524Z 2025-12-04T12:42:03.8964580Z =================================== FAILURES =================================== 2025-12-04T12:42:03.8964770Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.8964818Z Traceback (most recent call last): 2025-12-04T12:42:03.8964982Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.8965026Z self._join_processes(fn) 2025-12-04T12:42:03.8965208Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.8965263Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.8965440Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.8965485Z raise RuntimeError(error) 2025-12-04T12:42:03.8965565Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8965611Z Traceback (most recent call last): 2025-12-04T12:42:03.8965772Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8965824Z getattr(self, test_name)() 2025-12-04T12:42:03.8965983Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:42:03.8966017Z fn() 2025-12-04T12:42:03.8966170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8966222Z method(*args, **kwargs) 2025-12-04T12:42:03.8966373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8966424Z method(*args, **kwargs) 2025-12-04T12:42:03.8966575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8966612Z with policy(): 2025-12-04T12:42:03.8966766Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8966806Z raise RuntimeError(msg) 2025-12-04T12:42:03.8967248Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T12:42:03.8967251Z 2025-12-04T12:42:03.8967326Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8967672Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8967675Z 2025-12-04T12:42:03.8967762Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8967765Z 2025-12-04T12:42:03.8967768Z 2025-12-04T12:42:03.8967843Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.8967932Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:42:03.8968247Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-1732150cd52e220b.xml - 2025-12-04T12:42:03.8968311Z =========================== short test summary info ============================ 2025-12-04T12:42:03.8968668Z FAILED [8.6144s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.8968715Z Traceback (most recent call last): 2025-12-04T12:42:03.8968881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8968925Z getattr(self, test_name)() 2025-12-04T12:42:03.8969084Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8969135Z fn() 2025-12-04T12:42:03.8969287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8969329Z method(*args, **kwargs) 2025-12-04T12:42:03.8969479Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8969520Z method(*args, **kwargs) 2025-12-04T12:42:03.8969670Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8969707Z with policy(): 2025-12-04T12:42:03.8969871Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8969912Z raise RuntimeError(msg) 2025-12-04T12:42:03.8970355Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1262485504 and is now 2826960896. 2025-12-04T12:42:03.8970370Z 2025-12-04T12:42:03.8970459Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8970807Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8970810Z 2025-12-04T12:42:03.8970897Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8970961Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.8971021Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.8971060Z Got exit code 1 2025-12-04T12:42:03.8971100Z Retrying single test... 
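
[editor's note] The traceback above shows the parent test process failing in _join_processes/_check_return_codes once a worker exits with code 10. The snippet below is a hedged sketch of that join-and-check pattern using plain multiprocessing; it is not the PyTorch common_distributed.py implementation, and `worker`/`run_and_check` are illustrative names.

```python
import multiprocessing as mp

def worker(rank):
    # Exit code 10 is the failure code the workers in the log above report;
    # here rank 0 fails on purpose to trigger the parent-side error.
    raise SystemExit(10 if rank == 0 else 0)

def run_and_check(world_size=4):
    procs = [mp.Process(target=worker, args=(r,)) for r in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Mirror the parent-side check: raise if any worker exited nonzero.
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run_and_check()
```
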
2025-12-04T12:42:03.8971325Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-05ca25794532c849.xml 2025-12-04T12:42:03.8971384Z ============================= test session starts ============================== 2025-12-04T12:42:03.8971496Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.8971537Z cachedir: .pytest_cache 2025-12-04T12:42:03.8971694Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.8971742Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.8971782Z configfile: pytest.ini 2025-12-04T12:42:03.8971946Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.8972305Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8972357Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.8972701Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.8972759Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.8972816Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.8973165Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8973209Z Running 1 items in this shard 2025-12-04T12:42:03.8973213Z 2025-12-04T12:42:03.8973628Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda I1204 12:40:10.680000 475877 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 475946 2025-12-04T12:42:03.8973785Z I1204 12:40:10.681000 475877 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 475947 2025-12-04T12:42:03.8973947Z I1204 12:40:10.682000 475877 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 475948 2025-12-04T12:42:03.8974097Z I1204 12:40:10.683000 475877 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 475949 2025-12-04T12:42:03.8974777Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.8974843Z FSDP.set_state_dict_type(
2025-12-04T12:42:03.8975516Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:113: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T12:42:03.8975558Z FSDP.set_state_dict_type(
2025-12-04T12:42:03.8977515Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T12:42:03.8977566Z device = _get_pg_default_device(group)
2025-12-04T12:42:03.8979925Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:124: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html .
2025-12-04T12:42:03.8979982Z FSDP.set_state_dict_type(
2025-12-04T12:42:03.8981894Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_shard_utils.py:59: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`.
2025-12-04T12:42:03.8981954Z distributed_c10d._get_pg_default_device(pg).type
2025-12-04T12:42:03.8984571Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:732: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T12:42:03.8984613Z local_shape = tensor.shape
2025-12-04T12:42:03.8984849Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:749: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T12:42:03.8984886Z tensor.shape,
2025-12-04T12:42:03.8985118Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_state_dict_utils.py:751: FutureWarning: Please use DTensor instead and we are deprecating ShardedTensor.
2025-12-04T12:42:03.8985155Z tensor.dtype,
2025-12-04T12:42:03.8987728Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception:
2025-12-04T12:42:03.8987894Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last):
2025-12-04T12:42:03.8988216Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test
2025-12-04T12:42:03.8988365Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)()
2025-12-04T12:42:03.8988658Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper
2025-12-04T12:42:03.8988790Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] fn()
2025-12-04T12:42:03.8989063Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8989205Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T12:42:03.8989476Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper
2025-12-04T12:42:03.8989616Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs)
2025-12-04T12:42:03.8989888Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper
2025-12-04T12:42:03.8990016Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] with policy():
2025-12-04T12:42:03.8990287Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__
2025-12-04T12:42:03.8990429Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg)
2025-12-04T12:42:03.8990990Z E1204 12:40:18.079000 475949
site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1254096896 and is now 2826960896. 2025-12-04T12:42:03.8991101Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8991292Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8991772Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8991882Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8992086Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8992245Z E1204 12:40:18.079000 475949 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.8992387Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8992541Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8992820Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8992977Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8993264Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8993380Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8993649Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8993790Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8994063Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8994202Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8994478Z E1204 12:40:18.085000 475946 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.8994604Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.8994875Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.8995016Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.8995576Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 0. CUDA driver allocated memory was 1421869056 and is now 2973761536. 2025-12-04T12:42:03.8995684Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8995884Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.8996349Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.8996459Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.8996679Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.8996838Z E1204 12:40:18.085000 475946 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.8996967Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.8997119Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.8997406Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.8997564Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.8997841Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.8997956Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.8998260Z E1204 12:40:18.109000 475947 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8998402Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.8998673Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.8998813Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9000854Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9000991Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9001265Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9001408Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9001972Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 1. CUDA driver allocated memory was 1268776960 and is now 2826960896. 
2025-12-04T12:42:03.9002107Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9002296Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9002761Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9002883Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9003086Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9003244Z E1204 12:40:18.109000 475947 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9003389Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9003556Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9003841Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9003991Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9004270Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9004386Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9004655Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9004798Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9005068Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9005207Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9005475Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9005602Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9005873Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9006013Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9006582Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 2. CUDA driver allocated memory was 1268776960 and is now 2826960896. 2025-12-04T12:42:03.9006691Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9006882Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9007358Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9007465Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9007667Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9007834Z E1204 12:40:18.159000 475948 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9007887Z FAILED [8.6141s] [100%] 2025-12-04T12:42:03.9007889Z 2025-12-04T12:42:03.9007948Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9008140Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.9008226Z Traceback (most recent call last): 2025-12-04T12:42:03.9008392Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9008437Z self._join_processes(fn) 2025-12-04T12:42:03.9008610Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9008666Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9008844Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9008888Z raise RuntimeError(error) 2025-12-04T12:42:03.9008970Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9009016Z Traceback (most recent call last): 2025-12-04T12:42:03.9009180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9009224Z getattr(self, test_name)() 2025-12-04T12:42:03.9009385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 
2025-12-04T12:42:03.9009420Z fn() 2025-12-04T12:42:03.9009572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9009615Z method(*args, **kwargs) 2025-12-04T12:42:03.9009765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9009805Z method(*args, **kwargs) 2025-12-04T12:42:03.9009955Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9009992Z with policy(): 2025-12-04T12:42:03.9010144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9010185Z raise RuntimeError(msg) 2025-12-04T12:42:03.9010641Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1254096896 and is now 2826960896. 2025-12-04T12:42:03.9010645Z 2025-12-04T12:42:03.9010721Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9011067Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9011069Z 2025-12-04T12:42:03.9011170Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9011172Z 2025-12-04T12:42:03.9011174Z 2025-12-04T12:42:03.9011252Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9011342Z Process 3 terminated with exit code 10, terminating remaining processes. 
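Editor's note: the RuntimeError above comes from the mem_leak_check mode of this shard, which snapshots per-device memory counters before the test body and compares them afterwards. The sketch below is illustrative only, assuming public torch.cuda APIs (which map to HIP on ROCm); the real check lives in torch.testing._internal.common_utils and is more involved. The function and variable names are made up for the example.

    # Hedged sketch (not the actual PyTorch harness): a before/after memory
    # comparison of the kind that raises "CUDA driver API confirmed a leak".
    import torch

    def run_with_leak_check(test_fn, device=0):
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator bytes
        free_before, _total = torch.cuda.mem_get_info(device)   # driver-level free bytes
        test_fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _total = torch.cuda.mem_get_info(device)
        # Flag only when both the allocator and the driver report growth.
        if alloc_after > alloc_before and free_after < free_before:
            raise RuntimeError(
                f"possible memory leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes"
            )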
2025-12-04T12:42:03.9011617Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-05ca25794532c849.xml - 2025-12-04T12:42:03.9011704Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9012061Z FAILED [8.6141s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9012108Z Traceback (most recent call last): 2025-12-04T12:42:03.9012272Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9012314Z getattr(self, test_name)() 2025-12-04T12:42:03.9012473Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9012509Z fn() 2025-12-04T12:42:03.9012659Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9012701Z method(*args, **kwargs) 2025-12-04T12:42:03.9012851Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9012890Z method(*args, **kwargs) 2025-12-04T12:42:03.9013040Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9013079Z with policy(): 2025-12-04T12:42:03.9013230Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9013270Z raise RuntimeError(msg) 2025-12-04T12:42:03.9013714Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 14848 on device 3. CUDA driver allocated memory was 1254096896 and is now 2826960896. 2025-12-04T12:42:03.9013718Z 2025-12-04T12:42:03.9013792Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9014139Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9014142Z 2025-12-04T12:42:03.9014228Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9014292Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
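Editor's note: the FutureWarnings emitted throughout this test point to the replacement checkpoint APIs in torch.distributed.checkpoint.state_dict. A minimal migration sketch is below; `model` and `optimizer` are placeholders, and the specific StateDictOptions shown are assumptions for illustration, not taken from this job.

    # Hedged sketch of moving from FSDP.set_state_dict_type() to the
    # get_state_dict()/set_state_dict() APIs named in the warning.
    from torch.distributed.checkpoint.state_dict import (
        StateDictOptions,
        get_state_dict,
        set_state_dict,
    )

    def checkpoint_roundtrip(model, optimizer):
        # Sharded state dicts with CPU offload, roughly matching the older
        # StateDictType.SHARDED_STATE_DICT configuration (assumed here).
        options = StateDictOptions(full_state_dict=False, cpu_offload=True)
        model_sd, optim_sd = get_state_dict(model, optimizer, options=options)
        # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
        set_state_dict(
            model,
            optimizer,
            model_state_dict=model_sd,
            optim_state_dict=optim_sd,
            options=options,
        )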
2025-12-04T12:42:03.9014370Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.9014408Z Got exit code 1 2025-12-04T12:42:03.9014699Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9014831Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.9015059Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a4b4f2efd2b27f6d.xml 2025-12-04T12:42:03.9015129Z ============================= test session starts ============================== 2025-12-04T12:42:03.9015243Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9015283Z cachedir: .pytest_cache 2025-12-04T12:42:03.9015443Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9015500Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9015541Z configfile: pytest.ini 2025-12-04T12:42:03.9015706Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9016076Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9016127Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9016475Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9016532Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9016590Z collected 15 items / 12 deselected / 3 selected 2025-12-04T12:42:03.9016642Z stepcurrent: skipping 12 already run items. 2025-12-04T12:42:03.9016687Z Running 3 items in this shard 2025-12-04T12:42:03.9016690Z 2025-12-04T12:42:03.9017063Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda I1204 12:40:21.894000 476279 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 476348 2025-12-04T12:42:03.9017219Z I1204 12:40:21.895000 476279 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 476349 2025-12-04T12:42:03.9017371Z I1204 12:40:21.895000 476279 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 476350 2025-12-04T12:42:03.9017521Z I1204 12:40:21.896000 476279 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 476351 2025-12-04T12:42:03.9018225Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. 
API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9018269Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9018956Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9019000Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9019683Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9019727Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9020392Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9020467Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9020968Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9021016Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9021507Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9021554Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9022041Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.9022088Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9022575Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9022622Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9022758Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9022914Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9023196Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9023345Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9023635Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9023753Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9024024Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9024173Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9024442Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9024582Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9024862Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9025005Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9025276Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9025416Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9025938Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! 
Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T12:42:03.9026050Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9026239Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9026703Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9026813Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9027018Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9027177Z E1204 12:40:29.349000 476348 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9027307Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9027461Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9027739Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9027897Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9028204Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9028321Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9028604Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9028744Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9029013Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9029164Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9029453Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9029580Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T12:42:03.9029851Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9029992Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9030506Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9030615Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9030803Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9031219Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9031327Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9031533Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9031690Z E1204 12:40:29.363000 476349 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9031820Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9031972Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9032266Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9032413Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9032691Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9032806Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9033086Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9033229Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9033508Z E1204 12:40:29.401000 476351 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9033656Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9033924Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9034051Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9034324Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9034466Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9034978Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1101004800 and is now 2587885568. 2025-12-04T12:42:03.9035085Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9035273Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9035688Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9035795Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9035998Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9036156Z E1204 12:40:29.401000 476351 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9036284Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9036447Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9036728Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9036876Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9037162Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9037277Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9037545Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9037698Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9037977Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9038115Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9038418Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9038546Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9038815Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9038956Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9039467Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9039575Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9039766Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9040183Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9040289Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9040491Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9040646Z E1204 12:40:29.412000 476350 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9040687Z FAILED [8.6143s] [ 33%] 2025-12-04T12:42:03.9040703Z 2025-12-04T12:42:03.9040759Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9040905Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.9040953Z Traceback (most recent call last): 2025-12-04T12:42:03.9041115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9041160Z self._join_processes(fn) 2025-12-04T12:42:03.9041332Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9041399Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9041576Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9041620Z raise RuntimeError(error) 2025-12-04T12:42:03.9041700Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.9041760Z Traceback (most recent call last): 2025-12-04T12:42:03.9041921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9041976Z getattr(self, test_name)() 2025-12-04T12:42:03.9042134Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9042170Z fn() 2025-12-04T12:42:03.9042321Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9042362Z method(*args, **kwargs) 2025-12-04T12:42:03.9042513Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9042553Z method(*args, **kwargs) 2025-12-04T12:42:03.9042703Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9042742Z with policy(): 2025-12-04T12:42:03.9042892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9042934Z 
raise RuntimeError(msg) 2025-12-04T12:42:03.9043327Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T12:42:03.9043331Z 2025-12-04T12:42:03.9043405Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9043700Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9043703Z 2025-12-04T12:42:03.9043791Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9043793Z 2025-12-04T12:42:03.9043796Z 2025-12-04T12:42:03.9043872Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9043959Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9044234Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-a4b4f2efd2b27f6d.xml - 2025-12-04T12:42:03.9044294Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9044617Z FAILED [8.6143s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:42:03.9044665Z Traceback (most recent call last): 2025-12-04T12:42:03.9044830Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9044874Z getattr(self, test_name)() 2025-12-04T12:42:03.9045033Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9045070Z fn() 2025-12-04T12:42:03.9045220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9045273Z method(*args, **kwargs) 2025-12-04T12:42:03.9045426Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9045467Z method(*args, **kwargs) 2025-12-04T12:42:03.9045617Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9045673Z with policy(): 2025-12-04T12:42:03.9045824Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9045876Z raise RuntimeError(msg) 2025-12-04T12:42:03.9046269Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 
2025-12-04T12:42:03.9046272Z 2025-12-04T12:42:03.9046346Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9046644Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9046646Z 2025-12-04T12:42:03.9046734Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9046798Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9046860Z ======================= 1 failed, 12 deselected in 8.75s ======================= 2025-12-04T12:42:03.9046901Z Got exit code 1 2025-12-04T12:42:03.9046940Z Retrying single test... 2025-12-04T12:42:03.9047171Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f24772aeb8490283.xml 2025-12-04T12:42:03.9047229Z ============================= test session starts ============================== 2025-12-04T12:42:03.9047344Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9047384Z cachedir: .pytest_cache 2025-12-04T12:42:03.9047544Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9047591Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9047633Z configfile: pytest.ini 2025-12-04T12:42:03.9047800Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9048198Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9048249Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9048611Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9048670Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9048728Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9049019Z stepcurrent: skipping 12 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9049063Z Running 1 items in this shard 2025-12-04T12:42:03.9049066Z 2025-12-04T12:42:03.9049450Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda I1204 12:40:33.196000 476681 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 476750 2025-12-04T12:42:03.9049606Z I1204 12:40:33.197000 476681 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 476751 2025-12-04T12:42:03.9049759Z I1204 12:40:33.197000 476681 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 476752 2025-12-04T12:42:03.9049921Z I1204 12:40:33.198000 476681 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 476753 2025-12-04T12:42:03.9050618Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9050663Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9051334Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9051379Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9052046Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9052087Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9052757Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9052799Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9053300Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9053362Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9053854Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9053903Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9054401Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9054451Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9054936Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.9055003Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9055139Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9055294Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9055580Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9055727Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9056007Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9056124Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9056395Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9056537Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9056806Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9056947Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9057215Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9057347Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9057615Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9057770Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9058319Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2587885568. 
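Regarding the FutureWarning repeated above: the test still calls FSDP.set_state_dict_type, which is being deprecated in favor of get_state_dict()/set_state_dict() from torch.distributed.checkpoint.state_dict (see the API doc linked in the warning). A hedged sketch of the suggested replacement follows; `model` and `optim` are illustrative placeholders, not objects from this test file.

from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

def snapshot_and_restore(model, optim):
    # Replacement for FSDP.set_state_dict_type(...) + model.state_dict():
    # returns sharded/parallelism-aware state dicts for the model and optimizer.
    model_sd, optim_sd = get_state_dict(model, optim)
    # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
    # Later, load them back onto a freshly constructed model/optimizer pair.
    set_state_dict(model, optim, model_state_dict=model_sd, optim_state_dict=optim_sd)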
2025-12-04T12:42:03.9058428Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9058634Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9059051Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9059173Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9059390Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9059550Z E1204 12:40:40.555000 476753 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9059680Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9059836Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9060117Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9060263Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9060543Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9060658Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9060928Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9061069Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9061343Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9061484Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9061752Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9061880Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9062168Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9062312Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9062847Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9062955Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9063145Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9063558Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9063687Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9063889Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9064048Z E1204 12:40:40.568000 476752 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9064177Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9064329Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9064611Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9064758Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9065037Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9065151Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9065419Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9065559Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9065827Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T12:42:03.9065965Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9066234Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9066370Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9066640Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9066782Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9067302Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T12:42:03.9067411Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9067600Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9068024Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9068142Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9068397Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9068553Z E1204 12:40:40.638000 476750 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9068682Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9068835Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9069114Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9069259Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9069538Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9069652Z E1204 12:40:40.639000 476751 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9069923Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9070063Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9070332Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9070471Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9070755Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9070884Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9071153Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9071294Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9071818Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9071925Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9072128Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9072555Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9072663Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9072865Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9073024Z E1204 12:40:40.639000 476751 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9073063Z FAILED [8.5135s] [100%] 2025-12-04T12:42:03.9073065Z 2025-12-04T12:42:03.9073121Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9073265Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.9073311Z Traceback (most recent call last): 2025-12-04T12:42:03.9073474Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9073518Z self._join_processes(fn) 2025-12-04T12:42:03.9073689Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9073743Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9073920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9073965Z raise RuntimeError(error) 2025-12-04T12:42:03.9074044Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9074090Z Traceback (most recent call last): 2025-12-04T12:42:03.9074250Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9074293Z getattr(self, test_name)() 2025-12-04T12:42:03.9074453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9074486Z fn() 2025-12-04T12:42:03.9074638Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9074678Z method(*args, **kwargs) 2025-12-04T12:42:03.9074838Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9074878Z method(*args, **kwargs) 2025-12-04T12:42:03.9075029Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9075066Z with policy(): 2025-12-04T12:42:03.9075217Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9075257Z 
raise RuntimeError(msg) 2025-12-04T12:42:03.9075665Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2587885568. 2025-12-04T12:42:03.9075667Z 2025-12-04T12:42:03.9075743Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9076041Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9076063Z 2025-12-04T12:42:03.9076152Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9076154Z 2025-12-04T12:42:03.9076156Z 2025-12-04T12:42:03.9076229Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9076318Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9076589Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f24772aeb8490283.xml - 2025-12-04T12:42:03.9076651Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9076990Z FAILED [8.5135s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9077039Z Traceback (most recent call last): 2025-12-04T12:42:03.9077204Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9077246Z getattr(self, test_name)() 2025-12-04T12:42:03.9077406Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9077441Z fn() 2025-12-04T12:42:03.9077592Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9077632Z method(*args, **kwargs) 2025-12-04T12:42:03.9077784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9077825Z method(*args, **kwargs) 2025-12-04T12:42:03.9077975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9078013Z with policy(): 2025-12-04T12:42:03.9078197Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9078237Z raise RuntimeError(msg) 2025-12-04T12:42:03.9078636Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2587885568. 
2025-12-04T12:42:03.9078638Z 2025-12-04T12:42:03.9078735Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9079033Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9079037Z 2025-12-04T12:42:03.9079123Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9079186Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9079247Z ======================= 1 failed, 14 deselected in 8.68s ======================= 2025-12-04T12:42:03.9079284Z Got exit code 1 2025-12-04T12:42:03.9079336Z Retrying single test... 2025-12-04T12:42:03.9079563Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f73ec1e65b79e9d8.xml 2025-12-04T12:42:03.9079622Z ============================= test session starts ============================== 2025-12-04T12:42:03.9079732Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9079789Z cachedir: .pytest_cache 2025-12-04T12:42:03.9079948Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9080006Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9080045Z configfile: pytest.ini 2025-12-04T12:42:03.9080209Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9080569Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9080620Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9080968Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9081028Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9081083Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9081373Z stepcurrent: skipping 12 already run items. 
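On the PytestCollectionWarning above: pytest attempts to collect any class whose name starts with "Test" and skips it when the class defines __init__, so TestDummyModel and TestDummyModelUneven (nn.Modules used as fixtures, not test cases) trigger a harmless warning at collection time. One way such helper classes can opt out of collection explicitly is shown below; this is a hypothetical sketch, the actual test file simply tolerates the warning.

import torch

class TestDummyModel(torch.nn.Module):
    # Tell pytest not to collect this nn.Module even though its name starts with "Test".
    __test__ = False

    def __init__(self) -> None:
        super().__init__()
        self.net = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.net(x)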
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9081416Z Running 1 items in this shard 2025-12-04T12:42:03.9081420Z 2025-12-04T12:42:03.9081794Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda I1204 12:40:44.420000 477083 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 477152 2025-12-04T12:42:03.9081951Z I1204 12:40:44.421000 477083 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 477153 2025-12-04T12:42:03.9082102Z I1204 12:40:44.422000 477083 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 477154 2025-12-04T12:42:03.9082254Z I1204 12:40:44.423000 477083 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 477155 2025-12-04T12:42:03.9082949Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9082993Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9083672Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9083715Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9084393Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9084445Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9085115Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9085166Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9085663Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9085712Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9086201Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9086250Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9086737Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9086784Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9087270Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.9087317Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9087452Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9087606Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9087898Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9088047Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9088362Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9088491Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9088760Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9088902Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9089184Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9089336Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9089605Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9089734Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9090004Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9090144Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9090665Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 
2025-12-04T12:42:03.9090773Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9090964Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9091379Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9091488Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9091692Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9091849Z E1204 12:40:51.930000 477152 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9091991Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9092144Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9092424Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9092571Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9092863Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9092978Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9093247Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9093398Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9093684Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9093824Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9094091Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9094219Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9094490Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9094631Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9095143Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9095251Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9095440Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9095857Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9095964Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9096168Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9096334Z E1204 12:40:51.932000 477153 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9096465Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9096617Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9096897Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9097051Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9097328Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9097442Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9097720Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9097871Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9098139Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T12:42:03.9098317Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9098587Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9098715Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9098985Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9099126Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9099639Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 3. CUDA driver allocated memory was 1262485504 and is now 2587885568. 2025-12-04T12:42:03.9099745Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9099933Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9100347Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9100454Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9100668Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9100826Z E1204 12:40:51.961000 477155 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9100957Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9101110Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9101400Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9101545Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9101822Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9101948Z E1204 12:40:51.980000 477154 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9102231Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9102370Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9102639Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9102779Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9103045Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9103174Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9103442Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9103583Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9104101Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9104208Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9104396Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9104809Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9104927Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9105130Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9105288Z E1204 12:40:51.980000 477154 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9105327Z FAILED [8.8147s] [100%] 2025-12-04T12:42:03.9105330Z 2025-12-04T12:42:03.9105385Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9105529Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda _ 2025-12-04T12:42:03.9105585Z Traceback (most recent call last): 2025-12-04T12:42:03.9105749Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9105792Z self._join_processes(fn) 2025-12-04T12:42:03.9105967Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9106028Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9106206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9106260Z raise RuntimeError(error) 2025-12-04T12:42:03.9106340Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9106385Z Traceback (most recent call last): 2025-12-04T12:42:03.9106548Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9106589Z getattr(self, test_name)() 2025-12-04T12:42:03.9106751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9106784Z fn() 2025-12-04T12:42:03.9106937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9106978Z method(*args, **kwargs) 2025-12-04T12:42:03.9107131Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9107171Z method(*args, **kwargs) 2025-12-04T12:42:03.9107321Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9107357Z with policy(): 2025-12-04T12:42:03.9107510Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9107550Z 
raise RuntimeError(msg) 2025-12-04T12:42:03.9107945Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9107950Z 2025-12-04T12:42:03.9108026Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9108358Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9108360Z 2025-12-04T12:42:03.9108449Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9108451Z 2025-12-04T12:42:03.9108453Z 2025-12-04T12:42:03.9108528Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9108615Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9108910Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f73ec1e65b79e9d8.xml - 2025-12-04T12:42:03.9108971Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9109282Z FAILED [8.8147s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9109329Z Traceback (most recent call last): 2025-12-04T12:42:03.9109495Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9109551Z getattr(self, test_name)() 2025-12-04T12:42:03.9109712Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9109746Z fn() 2025-12-04T12:42:03.9109898Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9109955Z method(*args, **kwargs) 2025-12-04T12:42:03.9110106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9110156Z method(*args, **kwargs) 2025-12-04T12:42:03.9110306Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9110342Z with policy(): 2025-12-04T12:42:03.9110495Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9110535Z raise RuntimeError(msg) 2025-12-04T12:42:03.9110929Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda! Caching allocator allocated memory was 0 and is now reported as 2560 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9110932Z 2025-12-04T12:42:03.9111006Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9111300Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9111304Z 2025-12-04T12:42:03.9111391Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9111453Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9111515Z ======================= 1 failed, 14 deselected in 8.95s ======================= 2025-12-04T12:42:03.9111551Z Got exit code 1 2025-12-04T12:42:03.9111795Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda 2025-12-04T12:42:03.9111923Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.9112148Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-889307325d7c8e37.xml 2025-12-04T12:42:03.9112206Z ============================= test session starts ============================== 2025-12-04T12:42:03.9112319Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9112359Z cachedir: .pytest_cache 2025-12-04T12:42:03.9112522Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9112567Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9112608Z configfile: pytest.ini 2025-12-04T12:42:03.9112780Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9113142Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9113193Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9113548Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9113606Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9113662Z collected 15 items / 13 deselected / 2 selected 2025-12-04T12:42:03.9113714Z stepcurrent: skipping 13 already run items. 
2025-12-04T12:42:03.9113758Z Running 2 items in this shard 2025-12-04T12:42:03.9113760Z 2025-12-04T12:42:03.9114128Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda I1204 12:40:55.801000 477485 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 477554 2025-12-04T12:42:03.9114302Z I1204 12:40:55.802000 477485 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 477555 2025-12-04T12:42:03.9114454Z I1204 12:40:55.802000 477485 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 477556 2025-12-04T12:42:03.9114605Z I1204 12:40:55.803000 477485 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 477557 2025-12-04T12:42:03.9115281Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9115327Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9115998Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9116041Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9116707Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9116749Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9117426Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9117466Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9117962Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. 
If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9118011Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9118540Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9118589Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9119075Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9119145Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9119630Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9119676Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9119811Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9119967Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9120252Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9120399Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9120678Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9120794Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9121064Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9121205Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9121473Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9121612Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9121890Z E1204 12:41:03.224000 
477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9122021Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9122293Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9122433Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9122956Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 950009856 and is now 2587885568. 2025-12-04T12:42:03.9123073Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9123264Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9123688Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9123797Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9124001Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9124158Z E1204 12:41:03.224000 477557 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9124290Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9124442Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9124726Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9124872Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9125149Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9125265Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9125534Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9125673Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9125942Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9126093Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9126360Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9126490Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9126759Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9126917Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9127428Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9127584Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9127784Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9128260Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9128370Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9128573Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9128730Z E1204 12:41:03.237000 477556 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9128861Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9129013Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9129295Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9129440Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9129717Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9129832Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9130108Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9130247Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9130536Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9130676Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9130943Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9131072Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9131354Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9131495Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9132002Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T12:42:03.9132135Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9132325Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9132737Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9132844Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9133048Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9133207Z E1204 12:41:03.271000 477554 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9133336Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9133490Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9133769Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9133915Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9134192Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9134307Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9134576Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9134715Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9134992Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T12:42:03.9135132Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9135400Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9135537Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9135809Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9135950Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9136466Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9136584Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9136772Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9137185Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9137292Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9137494Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9137650Z E1204 12:41:03.275000 477555 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9137690Z FAILED [8.6153s] [ 50%] 2025-12-04T12:42:03.9137692Z 2025-12-04T12:42:03.9137748Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9137889Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.9137937Z Traceback (most recent call last): 2025-12-04T12:42:03.9138100Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9138172Z self._join_processes(fn) 2025-12-04T12:42:03.9138348Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9138401Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9138582Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9138626Z raise 
RuntimeError(error) 2025-12-04T12:42:03.9138708Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9138753Z Traceback (most recent call last): 2025-12-04T12:42:03.9138931Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9138974Z getattr(self, test_name)() 2025-12-04T12:42:03.9139133Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9139168Z fn() 2025-12-04T12:42:03.9139319Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9139359Z method(*args, **kwargs) 2025-12-04T12:42:03.9139511Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9139551Z method(*args, **kwargs) 2025-12-04T12:42:03.9139713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9139750Z with policy(): 2025-12-04T12:42:03.9139902Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9139943Z raise RuntimeError(msg) 2025-12-04T12:42:03.9140349Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 950009856 and is now 2587885568. 2025-12-04T12:42:03.9140364Z 2025-12-04T12:42:03.9140440Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9140736Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9140738Z 2025-12-04T12:42:03.9140826Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9140828Z 2025-12-04T12:42:03.9140830Z 2025-12-04T12:42:03.9140904Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9140992Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:42:03.9141261Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-889307325d7c8e37.xml - 2025-12-04T12:42:03.9141323Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9141631Z FAILED [8.6153s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9141678Z Traceback (most recent call last): 2025-12-04T12:42:03.9141843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9141885Z getattr(self, test_name)() 2025-12-04T12:42:03.9142045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9142081Z fn() 2025-12-04T12:42:03.9142232Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9142273Z method(*args, **kwargs) 2025-12-04T12:42:03.9142423Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9142462Z method(*args, **kwargs) 2025-12-04T12:42:03.9142613Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9142648Z with policy(): 2025-12-04T12:42:03.9142815Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9142856Z raise RuntimeError(msg) 2025-12-04T12:42:03.9143248Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 950009856 and is now 2587885568. 2025-12-04T12:42:03.9143251Z 2025-12-04T12:42:03.9143325Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9143629Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9143631Z 2025-12-04T12:42:03.9143718Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9143783Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9143843Z ======================= 1 failed, 13 deselected in 8.78s ======================= 2025-12-04T12:42:03.9143892Z Got exit code 1 2025-12-04T12:42:03.9143931Z Retrying single test... 
2025-12-04T12:42:03.9144158Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-47111fc25541c005.xml 2025-12-04T12:42:03.9144226Z ============================= test session starts ============================== 2025-12-04T12:42:03.9144339Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9144380Z cachedir: .pytest_cache 2025-12-04T12:42:03.9144538Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9144584Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9144623Z configfile: pytest.ini 2025-12-04T12:42:03.9144787Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9145144Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9145195Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9145541Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9145598Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9145652Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9145943Z stepcurrent: skipping 13 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9145988Z Running 1 items in this shard 2025-12-04T12:42:03.9145990Z 2025-12-04T12:42:03.9146359Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda I1204 12:41:07.191000 477887 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 477956 2025-12-04T12:42:03.9146515Z I1204 12:41:07.192000 477887 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 477957 2025-12-04T12:42:03.9146667Z I1204 12:41:07.193000 477887 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 477958 2025-12-04T12:42:03.9146817Z I1204 12:41:07.193000 477887 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 477959 2025-12-04T12:42:03.9147504Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9147550Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9148268Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9148322Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9148992Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9149047Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9149714Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9149757Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9150253Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9150301Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9150791Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9150839Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9151328Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9151374Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9151871Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. 
If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9151918Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9152053Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9152208Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9152500Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9152647Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9152925Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9153052Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9153334Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9153476Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9153749Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9153893Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9154161Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9154291Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9154562Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9154701Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9155217Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9155326Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9155515Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9155930Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9156057Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9156263Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9156421Z E1204 12:41:14.574000 477959 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9156552Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9156703Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9156993Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9157139Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9157416Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9157559Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9157828Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9157969Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9158264Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9158407Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9158678Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9158807Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9159078Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9159217Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9159729Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9159836Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9160027Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9160456Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9160563Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9160768Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9160926Z E1204 12:41:14.582000 477957 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9161055Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9161218Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9161497Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9161654Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9161930Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9162058Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9162326Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9162466Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9162733Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T12:42:03.9162873Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9163142Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9163271Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9163540Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9163680Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9164190Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9164296Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9164485Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9164907Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9165015Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9165218Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9165374Z E1204 12:41:14.603000 477958 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9165514Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9165666Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9165945Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9166111Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9166386Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9166500Z E1204 12:41:14.630000 477956 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9166769Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9166908Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9167176Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9167316Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9167582Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9167709Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9167980Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9168121Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9168674Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 
2025-12-04T12:42:03.9168780Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9169004Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9169416Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9169524Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9169725Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9169894Z E1204 12:41:14.630000 477956 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9169935Z FAILED [8.6147s] [100%] 2025-12-04T12:42:03.9169937Z 2025-12-04T12:42:03.9169993Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9170133Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.9170193Z Traceback (most recent call last): 2025-12-04T12:42:03.9170375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9170418Z self._join_processes(fn) 2025-12-04T12:42:03.9170591Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9170644Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9170822Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9170865Z raise RuntimeError(error) 2025-12-04T12:42:03.9170946Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9170991Z Traceback (most recent call last): 2025-12-04T12:42:03.9171153Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9171195Z getattr(self, test_name)() 2025-12-04T12:42:03.9171355Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9171388Z fn() 2025-12-04T12:42:03.9171539Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9171580Z method(*args, **kwargs) 2025-12-04T12:42:03.9171732Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9171771Z method(*args, **kwargs) 2025-12-04T12:42:03.9171922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9171959Z with policy(): 2025-12-04T12:42:03.9172112Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9172153Z 
raise RuntimeError(msg) 2025-12-04T12:42:03.9172550Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9172552Z 2025-12-04T12:42:03.9172628Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9172921Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9172934Z 2025-12-04T12:42:03.9173023Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9173026Z 2025-12-04T12:42:03.9173028Z 2025-12-04T12:42:03.9173102Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9173190Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9173460Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-47111fc25541c005.xml - 2025-12-04T12:42:03.9173531Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9173839Z FAILED [8.6147s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9173887Z Traceback (most recent call last): 2025-12-04T12:42:03.9174051Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9174109Z getattr(self, test_name)() 2025-12-04T12:42:03.9174283Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9174317Z fn() 2025-12-04T12:42:03.9174469Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9174508Z method(*args, **kwargs) 2025-12-04T12:42:03.9174660Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9174699Z method(*args, **kwargs) 2025-12-04T12:42:03.9174852Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9174889Z with policy(): 2025-12-04T12:42:03.9175041Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9175081Z raise RuntimeError(msg) 2025-12-04T12:42:03.9175474Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9175477Z 2025-12-04T12:42:03.9175552Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9175846Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9175849Z 2025-12-04T12:42:03.9175936Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9175999Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9176061Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.9176098Z Got exit code 1 2025-12-04T12:42:03.9176138Z Retrying single test... 2025-12-04T12:42:03.9176362Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f6f712e096927ea2.xml 2025-12-04T12:42:03.9176421Z ============================= test session starts ============================== 2025-12-04T12:42:03.9176533Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9176574Z cachedir: .pytest_cache 2025-12-04T12:42:03.9176743Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9176790Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9176830Z configfile: pytest.ini 2025-12-04T12:42:03.9176994Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9177357Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9177406Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9177762Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9177820Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9177876Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9178245Z stepcurrent: skipping 13 already run items. 
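The "CUDA driver API confirmed a leak" failures above are raised by the memory-leak check that this shard enables (the job config requests mem_leak_check, and the printed repro line exports PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1): the test harness snapshots per-device caching-allocator and driver-level memory before the test and fails it if usage has grown afterwards. The sketch below is only an illustrative approximation of that before/after comparison, built from public torch.cuda calls; it is not the actual implementation in torch/testing/_internal/common_utils.py, and the exact thresholds and reporting there differ.

# Illustrative approximation only (assumption), not the real leak checker in
# torch/testing/_internal/common_utils.py: the shape of the per-device
# before/after comparison that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 performs.
import torch

def run_with_leak_check(test_fn):
    """Run test_fn, then fail if per-device GPU memory usage grew."""
    if not torch.cuda.is_available():
        return test_fn()
    devices = range(torch.cuda.device_count())
    # Snapshot caching-allocator bytes and driver-level free bytes per device.
    alloc_before = [torch.cuda.memory_allocated(d) for d in devices]
    free_before = [torch.cuda.mem_get_info(d)[0] for d in devices]
    result = test_fn()
    for d in devices:
        torch.cuda.synchronize(d)
    torch.cuda.empty_cache()
    for d in devices:
        alloc_after = torch.cuda.memory_allocated(d)
        free_after = torch.cuda.mem_get_info(d)[0]
        if alloc_after > alloc_before[d] or free_after < free_before[d]:
            raise RuntimeError(
                f"possible leak on device {d}: caching allocator went from "
                f"{alloc_before[d]} to {alloc_after} bytes"
            )
    return result

On ROCm builds the torch.cuda namespace is backed by HIP, which is why this MI300 job still reports the numbers as "CUDA driver allocated memory".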
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9178306Z Running 1 items in this shard 2025-12-04T12:42:03.9178308Z 2025-12-04T12:42:03.9178675Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda I1204 12:41:18.316000 478289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 478358 2025-12-04T12:42:03.9178829Z I1204 12:41:18.317000 478289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 478359 2025-12-04T12:42:03.9178980Z I1204 12:41:18.318000 478289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 478360 2025-12-04T12:42:03.9179131Z I1204 12:41:18.318000 478289 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 478361 2025-12-04T12:42:03.9179814Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9179858Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9180527Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9180572Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9181238Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9181279Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9181959Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:80: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9182003Z FSDP.set_state_dict_type( 2025-12-04T12:42:03.9182518Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9182565Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9183055Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9183123Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9183610Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 2025-12-04T12:42:03.9183656Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9184141Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_optim_utils.py:1190: UserWarning: `_get_pg_default_device` will be deprecated, it only stays for backward-compatiblity reason. If you need to find a device for object collectives, please use `_get_object_coll_device`. If you need to query the device types supported by group, please use `_device_capability(group)`. 
2025-12-04T12:42:03.9184189Z device = _get_pg_default_device(group) 2025-12-04T12:42:03.9184323Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9184477Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9184764Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9184913Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9185192Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9185310Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9185581Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9185720Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9186000Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9186141Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9186409Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9186537Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9186819Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9186961Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9187482Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 954204160 and is now 2587885568. 
2025-12-04T12:42:03.9187614Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9187803Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9188244Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9188353Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9188558Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9188717Z E1204 12:41:25.867000 478361 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9188848Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9189000Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9189279Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9189427Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9189706Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9189821Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9190090Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9190246Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9190514Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9190655Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9190923Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9191169Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9191441Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9191580Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9192106Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2734686208. 2025-12-04T12:42:03.9192233Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9192421Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9192834Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9192942Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9193145Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9193301Z E1204 12:41:25.891000 478358 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9193432Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9193585Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9193862Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9194010Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9194287Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9194404Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9194682Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9194823Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9195091Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T12:42:03.9195232Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9195511Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9195638Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9195909Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9196058Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9196579Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9196687Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9196876Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9197289Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9197396Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9197598Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9197754Z E1204 12:41:25.894000 478359 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9197884Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9198035Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9198355Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9198502Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9198779Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9198894Z E1204 12:41:25.922000 478360 
site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9199177Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9199318Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9199586Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9199737Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9200005Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9200132Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9200417Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9200572Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9201083Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9201191Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9201381Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9201796Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9201902Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9202104Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9202260Z E1204 12:41:25.922000 478360 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9202300Z FAILED [9.0144s] [100%] 2025-12-04T12:42:03.9202303Z 2025-12-04T12:42:03.9202358Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9202500Z _ TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda _ 2025-12-04T12:42:03.9202545Z Traceback (most recent call last): 2025-12-04T12:42:03.9202709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9202752Z self._join_processes(fn) 2025-12-04T12:42:03.9202926Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9202979Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9203167Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9203211Z raise RuntimeError(error) 2025-12-04T12:42:03.9203291Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9203336Z Traceback (most recent call last): 2025-12-04T12:42:03.9203498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9203542Z getattr(self, test_name)() 2025-12-04T12:42:03.9203701Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9203736Z fn() 2025-12-04T12:42:03.9203898Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9203939Z method(*args, **kwargs) 2025-12-04T12:42:03.9204090Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9204131Z method(*args, **kwargs) 2025-12-04T12:42:03.9204290Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9204328Z with policy(): 2025-12-04T12:42:03.9204490Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9204531Z 
raise RuntimeError(msg) 2025-12-04T12:42:03.9204924Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9204926Z 2025-12-04T12:42:03.9205001Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9205293Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9205297Z 2025-12-04T12:42:03.9205385Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9205388Z 2025-12-04T12:42:03.9205390Z 2025-12-04T12:42:03.9205466Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9205552Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9205825Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-f6f712e096927ea2.xml - 2025-12-04T12:42:03.9205884Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9206195Z FAILED [9.0144s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9206242Z Traceback (most recent call last): 2025-12-04T12:42:03.9206405Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9206449Z getattr(self, test_name)() 2025-12-04T12:42:03.9206609Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9206643Z fn() 2025-12-04T12:42:03.9206795Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9206835Z method(*args, **kwargs) 2025-12-04T12:42:03.9206985Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9207041Z method(*args, **kwargs) 2025-12-04T12:42:03.9207190Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9207228Z with policy(): 2025-12-04T12:42:03.9207380Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9207421Z raise RuntimeError(msg) 2025-12-04T12:42:03.9207821Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9207824Z 2025-12-04T12:42:03.9207899Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9208217Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9208238Z 2025-12-04T12:42:03.9208327Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9208406Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9208469Z ======================= 1 failed, 14 deselected in 9.15s ======================= 2025-12-04T12:42:03.9208507Z Got exit code 1 2025-12-04T12:42:03.9208754Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda 2025-12-04T12:42:03.9208882Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.9209106Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-97ab67582658c2cb.xml 2025-12-04T12:42:03.9209165Z ============================= test session starts ============================== 2025-12-04T12:42:03.9209277Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9209319Z cachedir: .pytest_cache 2025-12-04T12:42:03.9209476Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9209522Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9209562Z configfile: pytest.ini 2025-12-04T12:42:03.9209725Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9210081Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9210133Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9210477Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9210536Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9210593Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9210645Z stepcurrent: skipping 14 already run items. 
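Alongside the failures, the runs above emit a FutureWarning that FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated in favor of the torch.distributed.checkpoint.state_dict APIs linked in the warning text. Below is a minimal migration sketch assuming those documented get_state_dict/set_state_dict entry points; the StateDictOptions setting shown is an illustrative assumption, not taken from this log.

# Minimal migration sketch for the FSDP.set_state_dict_type FutureWarning seen
# above, using the torch.distributed.checkpoint.state_dict APIs the warning
# links to. The options value is illustrative only.
from torch.distributed.checkpoint.state_dict import (
    StateDictOptions,
    get_state_dict,
    set_state_dict,
)

def checkpoint_roundtrip(model, optimizer):
    # Instead of FSDP.set_state_dict_type(...) followed by model.state_dict():
    model_sd, optim_sd = get_state_dict(
        model, optimizer, options=StateDictOptions(full_state_dict=False)
    )
    # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
    # Later, restore both in one call:
    set_state_dict(
        model,
        optimizer,
        model_state_dict=model_sd,
        optim_state_dict=optim_sd,
    )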
2025-12-04T12:42:03.9210688Z Running 1 items in this shard 2025-12-04T12:42:03.9210691Z 2025-12-04T12:42:03.9211040Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda I1204 12:41:29.866000 478691 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 478760 2025-12-04T12:42:03.9211196Z I1204 12:41:29.866000 478691 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 478761 2025-12-04T12:42:03.9211347Z I1204 12:41:29.867000 478691 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 478762 2025-12-04T12:42:03.9211498Z I1204 12:41:29.868000 478691 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 478763 2025-12-04T12:42:03.9212230Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9212325Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9213041Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9213145Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9213852Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9213943Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9214643Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9214732Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9214866Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9215022Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9215305Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9215452Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9215733Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9215860Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9216131Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9216273Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9216552Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9216692Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9216961Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9218866Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9219161Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9219303Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9219786Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1098907648 and is now 2587885568. 
2025-12-04T12:42:03.9219898Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9220108Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9220488Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9220596Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9220799Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9220957Z E1204 12:41:37.326000 478763 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9221087Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9221242Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9221519Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9221666Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9221959Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9222077Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9222347Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9222489Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9222771Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9222910Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9223180Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9223391Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9223661Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9223802Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9224279Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9224388Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9224576Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9224957Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9225062Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9225267Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9225423Z E1204 12:41:37.338000 478761 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9225555Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9225708Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9225987Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9226134Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9226420Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9226536Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9226806Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9226955Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9227226Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9227366Z E1204 12:41:37.370000 
478760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9227633Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9227789Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9228060Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9228233Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9228744Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664. 2025-12-04T12:42:03.9228854Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9229042Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9229417Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9229523Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9229726Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9229883Z E1204 12:41:37.370000 478760 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9230014Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9230164Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9230444Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9230608Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9230884Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9231000Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9231267Z E1204 12:41:37.384000 478762 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9231419Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9231688Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9231827Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9232111Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9232251Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9232521Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9232660Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9233134Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
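The RuntimeError above is raised by the GPU memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it records caching-allocator and driver-level memory per device before the test body and compares both counters afterwards, and it only reports a leak that the driver also confirms, which is why each message carries two numbers per device. A minimal standalone sketch of that before/after pattern, assuming a torch build with CUDA/ROCm; this is illustrative only, not PyTorch's internal CudaMemoryLeakCheck, and all names below are made up:

    import torch

    class MemLeakCheck:
        """Illustrative context manager: snapshot per-device memory, compare on exit."""

        def __enter__(self):
            torch.cuda.synchronize()
            torch.cuda.empty_cache()
            self.before = []
            for dev in range(torch.cuda.device_count()):
                free, total = torch.cuda.mem_get_info(dev)             # driver-level view
                self.before.append((torch.cuda.memory_allocated(dev),  # caching allocator
                                    total - free))                     # driver-allocated bytes
            return self

        def __exit__(self, exc_type, exc, tb):
            if exc_type is not None:
                return False                                           # never mask the test's own error
            torch.cuda.synchronize()
            for dev, (alloc_before, driver_before) in enumerate(self.before):
                alloc_after = torch.cuda.memory_allocated(dev)
                free, total = torch.cuda.mem_get_info(dev)
                driver_after = total - free
                # Flag only when both the allocator and the driver report growth,
                # mirroring the "CUDA driver API confirmed a leak" wording above.
                if alloc_after > alloc_before and driver_after > driver_before:
                    raise RuntimeError(
                        f"possible leak on device {dev}: caching allocator "
                        f"{alloc_before} -> {alloc_after}, driver "
                        f"{driver_before} -> {driver_after}")
            return False

In the failures above the caching-allocator delta is tiny (7680 bytes) while the driver-level delta is on the order of 1.3 GB, which is why both counters are printed in every message.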
2025-12-04T12:42:03.9233242Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9233430Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9233808Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9233915Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9234117Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9234275Z E1204 12:41:37.384000 478762 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9234315Z FAILED [8.8128s] [100%] 2025-12-04T12:42:03.9234318Z 2025-12-04T12:42:03.9234372Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9234482Z ___ TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda ____ 2025-12-04T12:42:03.9234528Z Traceback (most recent call last): 2025-12-04T12:42:03.9234691Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9234746Z self._join_processes(fn) 2025-12-04T12:42:03.9234920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9234974Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9235154Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9235198Z raise RuntimeError(error) 2025-12-04T12:42:03.9235278Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9235323Z Traceback (most recent call last): 2025-12-04T12:42:03.9235494Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9235537Z getattr(self, test_name)() 2025-12-04T12:42:03.9235697Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9235732Z fn() 2025-12-04T12:42:03.9235882Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9235935Z method(*args, **kwargs) 2025-12-04T12:42:03.9236095Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9236135Z method(*args, **kwargs) 2025-12-04T12:42:03.9236283Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9236321Z with policy(): 2025-12-04T12:42:03.9236474Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9236515Z raise RuntimeError(msg) 2025-12-04T12:42:03.9236875Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9236879Z 2025-12-04T12:42:03.9236956Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9237216Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9237218Z 2025-12-04T12:42:03.9237306Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9237309Z 2025-12-04T12:42:03.9237369Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9237414Z Traceback (most recent call last): 2025-12-04T12:42:03.9237576Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9237618Z getattr(self, test_name)() 2025-12-04T12:42:03.9237777Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9237812Z fn() 2025-12-04T12:42:03.9237963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9238003Z method(*args, **kwargs) 2025-12-04T12:42:03.9238191Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9238229Z method(*args, **kwargs) 2025-12-04T12:42:03.9238379Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9238416Z with policy(): 2025-12-04T12:42:03.9238567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9238621Z raise RuntimeError(msg) 2025-12-04T12:42:03.9238980Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1098907648 and is now 2587885568. 2025-12-04T12:42:03.9238984Z 2025-12-04T12:42:03.9239058Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9239316Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9239336Z 2025-12-04T12:42:03.9239424Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9239426Z 2025-12-04T12:42:03.9239428Z 2025-12-04T12:42:03.9239503Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9239592Z Process 1 terminated with exit code 10, terminating remaining processes. 
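Each failure block also prints a ready-to-run reproduction command and notes that PYTORCH_PRINT_REPRO_ON_FAILURE=0 suppresses the message. A small helper that wires those environment variables up when reproducing locally; the command, test path, and variable names come from the log, while the helper itself and the repo_root default are assumptions:

    import os
    import subprocess

    def reproduce(repo_root="~/pytorch", leak_check=True, print_repro=True):
        # Environment variables taken from the repro instructions in the log above.
        env = dict(
            os.environ,
            PYTORCH_TEST_WITH_ROCM="1",
            PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1" if leak_check else "0",
            PYTORCH_PRINT_REPRO_ON_FAILURE="1" if print_repro else "0",
        )
        cmd = [
            "python",
            "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py",
            "TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda",
        ]
        return subprocess.run(cmd, cwd=os.path.expanduser(repo_root), env=env).returncode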
2025-12-04T12:42:03.9239860Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-97ab67582658c2cb.xml - 2025-12-04T12:42:03.9239948Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9240222Z FAILED [8.8128s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9240269Z Traceback (most recent call last): 2025-12-04T12:42:03.9240431Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9240473Z getattr(self, test_name)() 2025-12-04T12:42:03.9240633Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9240667Z fn() 2025-12-04T12:42:03.9240817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9240859Z method(*args, **kwargs) 2025-12-04T12:42:03.9241009Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9241047Z method(*args, **kwargs) 2025-12-04T12:42:03.9241196Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9241233Z with policy(): 2025-12-04T12:42:03.9241385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9241424Z raise RuntimeError(msg) 2025-12-04T12:42:03.9241782Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
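The parent-side traceback (wrapper -> _join_processes -> _check_return_codes) shows how the distributed test harness turns a child rank's exit code 10 into the single test failure reported here. A rough illustration of that spawn/join/check pattern; this is a sketch, not torch's actual MultiProcessTestCase:

    import multiprocessing as mp

    def _worker(rank, fn):
        try:
            fn(rank)
        except Exception:
            # A failing rank leaves with a dedicated error code, as in the log ("exit code: 10").
            raise SystemExit(10)

    def run_in_processes(fn, world_size=4):
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=_worker, args=(rank, fn)) for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # Any nonzero child exit code becomes one parent-side RuntimeError,
        # which is what pytest ultimately records as the failure.
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")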
2025-12-04T12:42:03.9241786Z 2025-12-04T12:42:03.9241858Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9242113Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9242115Z 2025-12-04T12:42:03.9242201Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9242203Z 2025-12-04T12:42:03.9242263Z Process 3 exited with error code 10 and exception: 2025-12-04T12:42:03.9242307Z Traceback (most recent call last): 2025-12-04T12:42:03.9242479Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9242521Z getattr(self, test_name)() 2025-12-04T12:42:03.9244588Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9244631Z fn() 2025-12-04T12:42:03.9244784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9244825Z method(*args, **kwargs) 2025-12-04T12:42:03.9244974Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9245038Z method(*args, **kwargs) 2025-12-04T12:42:03.9245187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9245224Z with policy(): 2025-12-04T12:42:03.9245377Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9245419Z raise RuntimeError(msg) 2025-12-04T12:42:03.9245777Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1098907648 and is now 2587885568. 2025-12-04T12:42:03.9245805Z 2025-12-04T12:42:03.9245880Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9246350Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9246352Z 2025-12-04T12:42:03.9246440Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9246503Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9246567Z ======================= 1 failed, 14 deselected in 8.95s ======================= 2025-12-04T12:42:03.9246603Z Got exit code 1 2025-12-04T12:42:03.9246644Z Retrying single test... 
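After the shard exits with code 1, the runner retries only the failing test: a fresh pytest session is started for the single node id, stepcurrent skips the 14 already-run items, and each attempt writes its own junit xml. A sketch of that retry loop; the function and report-path names below are illustrative, not the actual run_test.py logic:

    import subprocess
    import uuid

    def retry_single(node_id, attempts=3, report_dir="test-reports/python-pytest"):
        for _ in range(attempts):
            xml = f"{report_dir}/retry-{uuid.uuid4().hex}.xml"   # one report per attempt
            rc = subprocess.run(
                ["python", "-m", "pytest", "-v", "-x", node_id, f"--junit-xml={xml}"]
            ).returncode
            print(f"Got exit code {rc}")
            if rc == 0:
                return True
            print("Retrying single test...")
        return False

For the failure above the node id would be test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda, as printed in the "Running only ..." line.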
2025-12-04T12:42:03.9246982Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-166946d282ac9173.xml 2025-12-04T12:42:03.9247042Z ============================= test session starts ============================== 2025-12-04T12:42:03.9247157Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9247197Z cachedir: .pytest_cache 2025-12-04T12:42:03.9247358Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9247405Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9247445Z configfile: pytest.ini 2025-12-04T12:42:03.9247610Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9247973Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9248027Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9248412Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9248469Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9248525Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9248802Z stepcurrent: skipping 14 already run items. Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9248849Z Running 1 items in this shard 2025-12-04T12:42:03.9248852Z 2025-12-04T12:42:03.9249189Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda I1204 12:41:41.095000 479093 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 479162 2025-12-04T12:42:03.9249347Z I1204 12:41:41.096000 479093 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 479163 2025-12-04T12:42:03.9249516Z I1204 12:41:41.096000 479093 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 479164 2025-12-04T12:42:03.9249667Z I1204 12:41:41.097000 479093 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 479165 2025-12-04T12:42:03.9250389Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9250515Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9251229Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9251322Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9252029Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9252120Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9252829Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9252918Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9253056Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9253211Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9253507Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9253656Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9253936Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9254053Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9254334Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9254478Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9254747Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9254899Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9255177Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9255307Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9255578Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9255720Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9256205Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
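The FutureWarning emitted by every rank above points at the newer checkpoint APIs. A minimal sketch of the suggested direction, assuming model and optimizer are the FSDP-wrapped module and its optimizer; see the linked API doc for the full options, this is not a drop-in replacement for the test's set_state_dict_type call:

    from torch.distributed.checkpoint.state_dict import get_state_dict, set_state_dict

    def snapshot_and_restore(model, optimizer):
        # Gather model/optimizer state through the parallelism-agnostic API the
        # warning recommends (it covers FSDP1, FSDP2, and DDP).
        model_sd, optim_sd = get_state_dict(model, optimizer)
        # ... persist model_sd / optim_sd, e.g. via torch.distributed.checkpoint ...
        # Load the same structures back onto the wrapped module and optimizer.
        set_state_dict(
            model,
            optimizer,
            model_state_dict=model_sd,
            optim_state_dict=optim_sd,
        )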
2025-12-04T12:42:03.9256316Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9256508Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9256888Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9256999Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9257207Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9257364Z E1204 12:41:48.469000 479163 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9257494Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9257646Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9257943Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9258089Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9258399Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9258515Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9258795Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9258937Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9259205Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9259375Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9259642Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9259770Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9260043Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9260184Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9260661Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9260769Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9260960Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9261337Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9261446Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9261648Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9261807Z E1204 12:41:48.477000 479164 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9261937Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9262105Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9262386Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9262532Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9262808Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9262933Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9263203Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9263343Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9263620Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9263772Z E1204 12:41:48.505000 
479165 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9264041Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9264170Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9264441Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9264582Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9265058Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1262485504 and is now 2587885568. 2025-12-04T12:42:03.9265165Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9265356Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9265730Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9265840Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9266040Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9266198Z E1204 12:41:48.505000 479165 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9266327Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9266490Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9266768Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9266915Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9267203Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9267316Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9267584Z E1204 12:41:48.516000 479162 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9267723Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9268011Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9268210Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9268477Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9268605Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9268874Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9269017Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9269489Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664. 
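The PytestCollectionWarning lines in each session header come from pytest's collection rules: a class whose name matches the Test* pattern is collected as a test class only if it has no __init__, so helper modules such as TestDummyModel are skipped and warned about. A self-contained illustration; the file name and the small model below are hypothetical:

    # test_collect_demo.py (hypothetical file name)
    import torch

    class TestDummyModel(torch.nn.Module):   # name matches pytest's Test* pattern,
        def __init__(self):                  # but the __init__ makes pytest skip collecting it
            super().__init__()
            self.net = torch.nn.Linear(8, 8)

        def forward(self, x):
            return self.net(x)

    def test_forward_shape():                # ordinary test functions still collect normally
        assert TestDummyModel()(torch.randn(2, 8)).shape == (2, 8)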
2025-12-04T12:42:03.9269596Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9269784Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9270158Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9270267Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9270469Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9270627Z E1204 12:41:48.516000 479162 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9270666Z FAILED [8.6142s] [100%] 2025-12-04T12:42:03.9270682Z 2025-12-04T12:42:03.9270740Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9270847Z ___ TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda ____ 2025-12-04T12:42:03.9270895Z Traceback (most recent call last): 2025-12-04T12:42:03.9271059Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9271104Z self._join_processes(fn) 2025-12-04T12:42:03.9271278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9271344Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9271523Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9271566Z raise RuntimeError(error) 2025-12-04T12:42:03.9271646Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9271691Z Traceback (most recent call last): 2025-12-04T12:42:03.9271853Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9271926Z getattr(self, test_name)() 2025-12-04T12:42:03.9272085Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9272119Z fn() 2025-12-04T12:42:03.9272271Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9272312Z method(*args, **kwargs) 2025-12-04T12:42:03.9272462Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9272501Z method(*args, **kwargs) 2025-12-04T12:42:03.9272653Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9272690Z with policy(): 2025-12-04T12:42:03.9272843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9272884Z raise RuntimeError(msg) 2025-12-04T12:42:03.9273243Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9273246Z 2025-12-04T12:42:03.9273321Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9273577Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9273581Z 2025-12-04T12:42:03.9273668Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9273670Z 2025-12-04T12:42:03.9273672Z 2025-12-04T12:42:03.9273747Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9273835Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9274107Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-166946d282ac9173.xml - 2025-12-04T12:42:03.9274168Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9274443Z FAILED [8.6142s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:42:03.9274489Z Traceback (most recent call last): 2025-12-04T12:42:03.9274664Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9274707Z getattr(self, test_name)() 2025-12-04T12:42:03.9274867Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9274902Z fn() 2025-12-04T12:42:03.9275053Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9275092Z method(*args, **kwargs) 2025-12-04T12:42:03.9275253Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9275292Z method(*args, **kwargs) 2025-12-04T12:42:03.9275441Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9275479Z with policy(): 2025-12-04T12:42:03.9275631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9275683Z raise RuntimeError(msg) 2025-12-04T12:42:03.9276040Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
2025-12-04T12:42:03.9276054Z 2025-12-04T12:42:03.9276128Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9276384Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9276385Z 2025-12-04T12:42:03.9276471Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9276535Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9276596Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.9276635Z Got exit code 1 2025-12-04T12:42:03.9276675Z Retrying single test... 2025-12-04T12:42:03.9276903Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-9a57974f2962ab4b.xml 2025-12-04T12:42:03.9276960Z ============================= test session starts ============================== 2025-12-04T12:42:03.9277073Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9277114Z cachedir: .pytest_cache 2025-12-04T12:42:03.9277272Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9277319Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9277358Z configfile: pytest.ini 2025-12-04T12:42:03.9277522Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9277880Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9277931Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9278316Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9278376Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9278448Z collected 15 items / 14 deselected / 1 selected 2025-12-04T12:42:03.9278699Z stepcurrent: skipping 14 already run items. 
Running only test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9278744Z Running 1 items in this shard 2025-12-04T12:42:03.9278748Z 2025-12-04T12:42:03.9279080Z distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda I1204 12:41:52.213000 479495 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 479564 2025-12-04T12:42:03.9279292Z I1204 12:41:52.214000 479495 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 479565 2025-12-04T12:42:03.9279443Z I1204 12:41:52.214000 479495 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 479566 2025-12-04T12:42:03.9279594Z I1204 12:41:52.215000 479495 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 479567 2025-12-04T12:42:03.9280314Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9280434Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9281149Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9281238Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9281942Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 2025-12-04T12:42:03.9282031Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9282735Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:822: FutureWarning: FSDP.state_dict_type() and FSDP.set_state_dict_type() are being deprecated. Please use APIs, get_state_dict() and set_state_dict(), which can support different parallelisms, FSDP1, FSDP2, DDP. API doc: https://pytorch.org/docs/stable/distributed.checkpoint.html#torch.distributed.checkpoint.state_dict.get_state_dict .Tutorial: https://pytorch.org/tutorials/recipes/distributed_checkpoint_recipe.html . 
2025-12-04T12:42:03.9282825Z prev_state_dict_settings = FullyShardedDataParallel.set_state_dict_type( 2025-12-04T12:42:03.9282959Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9283115Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9283406Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9283554Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9283833Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9283948Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9284228Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9284370Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9284637Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9284798Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9285068Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9285197Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9285467Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9285607Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9286084Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 3. CUDA driver allocated memory was 1254096896 and is now 2587885568. 
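Each pytest session header above also reports the hypothesis profile in use ('pytorch_ci': database=None, max_examples=50, derandomize=True, too_slow suppressed). A sketch of how such a profile is registered and selected; the parameter values mirror the log, while the registration site (a conftest.py) is an assumption:

    from hypothesis import HealthCheck, settings

    # Register a CI profile matching the parameters printed in the session header.
    settings.register_profile(
        "pytorch_ci",
        database=None,
        max_examples=50,
        derandomize=True,
        suppress_health_check=[HealthCheck.too_slow],
    )
    settings.load_profile("pytorch_ci")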
2025-12-04T12:42:03.9286194Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9286382Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9286759Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9286869Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9287072Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9287229Z E1204 12:41:59.653000 479567 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:42:03.9287358Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9287520Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9287798Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9287946Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9288442Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9288579Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9288849Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9288988Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9289276Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9289430Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9289697Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9289824Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9290093Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9290235Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9290711Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9290820Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9291008Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9291383Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9291490Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9291692Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9291850Z E1204 12:41:59.666000 479566 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:42:03.9291978Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9292146Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9292423Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9292572Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9292867Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9292983Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9293252Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9293393Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9293690Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9293829Z E1204 12:41:59.677000 
479565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9294098Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9294226Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9294497Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9294639Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9295114Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 1. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9295221Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9295409Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9295785Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9295891Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9296095Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9296251Z E1204 12:41:59.677000 479565 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:42:03.9296392Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:42:03.9296545Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:42:03.9296826Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9296972Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:42:03.9297259Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9297374Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:42:03.9297641Z E1204 12:41:59.704000 479564 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9297802Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9298070Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9298248Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:42:03.9298517Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9298644Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:42:03.9298915Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9299056Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:42:03.9299531Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 0. CUDA driver allocated memory was 1421869056 and is now 2740977664. 
2025-12-04T12:42:03.9299643Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9299831Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9300209Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9300314Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:42:03.9300518Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9300687Z E1204 12:41:59.704000 479564 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:42:03.9300729Z FAILED [8.6145s] [100%] 2025-12-04T12:42:03.9300731Z 2025-12-04T12:42:03.9300787Z =================================== FAILURES =================================== 2025-12-04T12:42:03.9300898Z ___ TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda ____ 2025-12-04T12:42:03.9300945Z Traceback (most recent call last): 2025-12-04T12:42:03.9301109Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:42:03.9301152Z self._join_processes(fn) 2025-12-04T12:42:03.9301340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:42:03.9301393Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:42:03.9301574Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:42:03.9301618Z raise RuntimeError(error) 2025-12-04T12:42:03.9301698Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.9301757Z Traceback (most recent call last): 2025-12-04T12:42:03.9301918Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9301975Z getattr(self, test_name)() 2025-12-04T12:42:03.9302134Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9302170Z fn() 2025-12-04T12:42:03.9302322Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9302363Z method(*args, **kwargs) 2025-12-04T12:42:03.9302514Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9302555Z method(*args, **kwargs) 2025-12-04T12:42:03.9302705Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9302744Z with policy(): 2025-12-04T12:42:03.9302898Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9302939Z raise RuntimeError(msg) 2025-12-04T12:42:03.9303294Z 
RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 2025-12-04T12:42:03.9303297Z 2025-12-04T12:42:03.9303373Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9303629Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9303633Z 2025-12-04T12:42:03.9303721Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9303724Z 2025-12-04T12:42:03.9303727Z 2025-12-04T12:42:03.9303802Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:42:03.9303890Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:42:03.9304162Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-9a57974f2962ab4b.xml - 2025-12-04T12:42:03.9304221Z =========================== short test summary info ============================ 2025-12-04T12:42:03.9304507Z FAILED [8.6145s] distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:42:03.9304555Z Traceback (most recent call last): 2025-12-04T12:42:03.9304718Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:42:03.9304763Z getattr(self, test_name)() 2025-12-04T12:42:03.9304922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:42:03.9304956Z fn() 2025-12-04T12:42:03.9305107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9305165Z method(*args, **kwargs) 2025-12-04T12:42:03.9305316Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:42:03.9305356Z method(*args, **kwargs) 2025-12-04T12:42:03.9305504Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:42:03.9305542Z with policy(): 2025-12-04T12:42:03.9305693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:42:03.9305756Z raise RuntimeError(msg) 2025-12-04T12:42:03.9306113Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda! Caching allocator allocated memory was 0 and is now reported as 7680 on device 2. CUDA driver allocated memory was 1268776960 and is now 2587885568. 
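Each failing entry above follows the same pattern: every rank's child process logs "exiting process N with exit code: 10", and the parent then raises "Process N exited with error code 10 and exception: ..." from _check_return_codes. The snippet below is a simplified stand-in for that spawn-and-join flow, not the actual MultiProcessTestCase implementation; the rank body is a placeholder and only the exit code 10 is taken from the log above.

import multiprocessing as mp

def _rank_main(rank, world_size):
    # Placeholder for the per-rank test body. A failing rank exits with a
    # nonzero status, mirroring "exiting process N with exit code: 10" above.
    raise SystemExit(10)

def run_multiprocess_test(world_size=4):
    procs = [mp.Process(target=_rank_main, args=(r, world_size)) for r in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            # The parent surfaces a bad exit code, which is what shows up as
            # "RuntimeError: Process N exited with error code 10" in the summary.
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")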
2025-12-04T12:42:03.9306116Z 2025-12-04T12:42:03.9306190Z To execute this test, run the following from the base repo dir: 2025-12-04T12:42:03.9306446Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_dtensor_state_dict.py TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9306449Z 2025-12-04T12:42:03.9306534Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:42:03.9306597Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:42:03.9306657Z ======================= 1 failed, 14 deselected in 8.75s ======================= 2025-12-04T12:42:03.9306697Z Got exit code 1 2025-12-04T12:42:03.9306903Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda 2025-12-04T12:42:03.9307031Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:42:03.9307257Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c23a818ffc04ad44.xml 2025-12-04T12:42:03.9307315Z ============================= test session starts ============================== 2025-12-04T12:42:03.9307427Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:42:03.9307469Z cachedir: .pytest_cache 2025-12-04T12:42:03.9307626Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:42:03.9307673Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:42:03.9307712Z configfile: pytest.ini 2025-12-04T12:42:03.9307876Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:42:03.9308274Z collecting ... /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:31: PytestCollectionWarning: cannot collect test class 'TestDummyModel' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9308324Z class TestDummyModel(torch.nn.Module): 2025-12-04T12:42:03.9308683Z /var/lib/jenkins/pytorch/test/distributed/fsdp/test_fsdp_dtensor_state_dict.py:47: PytestCollectionWarning: cannot collect test class 'TestDummyModelUneven' because it has a __init__ constructor (from: test/distributed/fsdp/test_fsdp_dtensor_state_dict.py) 2025-12-04T12:42:03.9308741Z class TestDummyModelUneven(torch.nn.Module): 2025-12-04T12:42:03.9308799Z collected 15 items / 15 deselected / 0 selected 2025-12-04T12:42:03.9308852Z stepcurrent: skipping 15 already run items. 
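The "To execute this test, run the following from the base repo dir" lines above already give a one-line shell repro. Below is a hedged sketch of driving the same command from Python; the checkout path is a placeholder, and only the two environment variables shown in the log are added (setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 would suppress the repro banner, as the log notes).

import os
import subprocess

env = dict(
    os.environ,
    PYTORCH_TEST_WITH_ROCM="1",
    PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
)
subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_dtensor_state_dict.py",
        "TestFSDPWithDeviceMeshAndDTensorCUDA.test_raises_warning_or_errors_cuda",
    ],
    cwd="/path/to/pytorch",  # placeholder for the base repo dir mentioned in the log
    env=env,
    check=True,
)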
2025-12-04T12:42:03.9308896Z Running 0 items in this shard 2025-12-04T12:42:03.9308898Z 2025-12-04T12:42:03.9309181Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_dtensor_state_dict/distributed.fsdp.test_fsdp_dtensor_state_dict-c23a818ffc04ad44.xml - 2025-12-04T12:42:03.9309241Z ============================ 15 deselected in 0.01s ============================ 2025-12-04T12:42:03.9313004Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_model_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_False_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_optim_load_state_dict_offload_to_cpu_True_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_False_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_dtensor_sharded_tensor_state_dict_identical_offload_to_cpu_True_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_False_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_fsdp_init_with_device_mesh_is_even_sharded_model_True_cuda', 'test/distributed/fsdp/test_fsdp_dtensor_state_dict.py::TestFSDPWithDeviceMeshAndDTensorCUDA::test_raises_warning_or_errors_cuda'] 2025-12-04T12:42:03.9313039Z 2025-12-04T12:42:03.9313257Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 (test/test-reports/distributed.fsdp.test_fsdp_dtensor_state_dict_1.1_429921b2f227c24a_.log) 2025-12-04T12:42:03.9313260Z 
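Every failure listed above is the same mem-leak-check assertion: with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 the harness snapshots per-device memory counters before the test body runs and compares them afterwards, and any growth in the caching-allocator number is reported together with the driver-level number. The sketch below illustrates that before/after comparison using the public torch.cuda counters; it is an illustration only, not the exact logic of the leak-check policy referenced in the tracebacks.

import torch

def _snapshot(device):
    # Bytes currently held by the caching allocator on this device.
    allocated = torch.cuda.memory_allocated(device)
    # Driver-level usage: total minus free, as reported by cudaMemGetInfo/hipMemGetInfo.
    free, total = torch.cuda.mem_get_info(device)
    return allocated, total - free

def check_for_leak(run_test, device=0):
    torch.cuda.synchronize(device)
    alloc_before, driver_before = _snapshot(device)
    run_test()
    torch.cuda.synchronize(device)
    alloc_after, driver_after = _snapshot(device)
    if alloc_after > alloc_before:
        # Mirrors the wording of the RuntimeError seen throughout this log.
        raise RuntimeError(
            f"Caching allocator allocated memory was {alloc_before} and is now "
            f"reported as {alloc_after} on device {device}. CUDA driver allocated "
            f"memory was {driver_before} and is now {driver_after}."
        )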
2025-12-04T12:42:03.9313410Z Finished distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 ... [2025-12-04 12:42:03.732641][2291622.381822912], took 8.55min 2025-12-04T12:42:03.9313672Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:42:03.9313762Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:42:03.9313857Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:42:03.9313903Z Uploading artifacts took 0.00 seconds 2025-12-04T12:42:03.9313976Z distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 failed! 2025-12-04T12:42:03.9314101Z Running distributed/fsdp/test_fsdp_comm_hooks 1/1 ... [2025-12-04 12:42:03.735809][2291622.384993073] 2025-12-04T12:42:03.9314153Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:42:03.9314473Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm_hooks.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:42:03.735983] 2025-12-04T12:44:59.6648062Z 2025-12-04T12:44:59.6649101Z distributed/fsdp/test_fsdp_comm_hooks 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_comm_hooks_1.1_97288ec4bda6c925_.log 2025-12-04T12:44:59.6655216Z Running 28 items in this shard: test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_False_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_False_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_False_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_True_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_True_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_bf16_hook_has_wrapping_True_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_behavior_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_behavior_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_behavior_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_False_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy1, 
test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_default_communication_hook_initialization_has_wrapping_True_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_False_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_False_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_False_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_True_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_True_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_fp16_hook_has_wrapping_True_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_hybrid_strategy, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_non_root_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_non_root_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_non_root_sharding_strategy2, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_submodules_sharding_strategy0, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_submodules_sharding_strategy1, test/distributed/fsdp/test_fsdp_comm_hooks.py::TestCommunicationHooks::test_registering_hook_submodules_sharding_strategy2 2025-12-04T12:44:59.6660825Z 2025-12-04T12:44:59.6660972Z Finished distributed/fsdp/test_fsdp_comm_hooks 1/1 ... [2025-12-04 12:44:59.664665][2291798.313846272], took 2.93min 2025-12-04T12:44:59.6669529Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:44:59.6688699Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:44:59.6692233Z Running distributed/fsdp/test_fsdp_hybrid_shard 1/1 ... [2025-12-04 12:44:59.669038][2291798.318221154] 2025-12-04T12:44:59.6692450Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:44:59.6693738Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_hybrid_shard.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:44:59.669267] 2025-12-04T12:46:00.8655994Z 2025-12-04T12:46:00.8657439Z distributed/fsdp/test_fsdp_hybrid_shard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_hybrid_shard_1.1_dbce83217519cbf5_.log 2025-12-04T12:46:00.8661047Z Running 6 items in this shard: test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_fsdp_hybrid_shard_basic_setup, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_fsdp_hybrid_shard_parity, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_hsdp_save_load_state_dict, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_hsdp_sync_module_state, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_invalid_pg_specification_raises, test/distributed/fsdp/test_fsdp_hybrid_shard.py::TestFSDPHybridShard::test_raises_manual_wrap_hybrid_shard_when_none_policy 2025-12-04T12:46:00.8663681Z 2025-12-04T12:46:00.8664103Z Finished distributed/fsdp/test_fsdp_hybrid_shard 1/1 ... [2025-12-04 12:46:00.865386][2291859.514565652], took 1.02min 2025-12-04T12:46:00.8672985Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:46:00.8690148Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:46:00.8693036Z Running distributed/_shard/test_sharder 1/1 ... [2025-12-04 12:46:00.869214][2291859.518397333] 2025-12-04T12:46:00.8693235Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:46:00.8695160Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/test_sharder.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:46:00.869412] 2025-12-04T12:46:12.7521034Z 2025-12-04T12:46:12.7522972Z distributed/_shard/test_sharder 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.test_sharder_1.1_1eaaa65535e8636d_.log 2025-12-04T12:46:12.7524290Z Running 2 items in this shard: test/distributed/_shard/test_sharder.py::TestCustomSharder::test_custom_sharder, test/distributed/_shard/test_sharder.py::TestCustomSharder::test_custom_sharder_errors 2025-12-04T12:46:12.7524990Z 2025-12-04T12:46:12.7525287Z Finished distributed/_shard/test_sharder 1/1 ... [2025-12-04 12:46:12.751805][2291871.400984945], took 0.20min 2025-12-04T12:46:12.7539996Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:46:12.7556734Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:46:12.7559993Z Running distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 ... [2025-12-04 12:46:12.755870][2291871.405052992] 2025-12-04T12:46:12.7560399Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:46:12.7561871Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_tensor_ops.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:46:12.756058] 2025-12-04T12:46:37.6071370Z 2025-12-04T12:46:37.6072898Z distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_tensor_ops_1.1_1dd185e705cadcbb_.log 2025-12-04T12:46:37.6075838Z Running 5 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_clone, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_deep_copy, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_detach, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_inplace_copy, test/distributed/_shard/sharded_tensor/ops/test_tensor_ops.py::TestTensorOps::test_set_requires_grad 2025-12-04T12:46:37.6077858Z 2025-12-04T12:46:37.6078539Z Finished distributed/_shard/sharded_tensor/ops/test_tensor_ops 1/1 ... [2025-12-04 12:46:37.606944][2291896.25612344], took 0.41min 2025-12-04T12:46:37.6089552Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:46:37.6106581Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:46:37.6110077Z Running distributed/_shard/sharding_plan/test_sharding_plan 1/1 ... [2025-12-04 12:46:37.610884][2291896.26006846] 2025-12-04T12:46:37.6110313Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:46:37.6112049Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharding_plan/test_sharding_plan.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:46:37.611083] 2025-12-04T12:46:53.8988834Z 2025-12-04T12:46:53.8990362Z distributed/_shard/sharding_plan/test_sharding_plan 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharding_plan.test_sharding_plan_1.1_5a261ebd1ee62caf_.log 2025-12-04T12:46:53.8992605Z Running 3 items in this shard: test/distributed/_shard/sharding_plan/test_sharding_plan.py::TestShardingPlan::test_custom_sharding_planner, test/distributed/_shard/sharding_plan/test_sharding_plan.py::TestShardingPlan::test_shard_module_sub_process_group, test/distributed/_shard/sharding_plan/test_sharding_plan.py::TestShardingPlan::test_sharding_plan_errors 2025-12-04T12:46:53.8993880Z 2025-12-04T12:46:53.8994629Z Finished distributed/_shard/sharding_plan/test_sharding_plan 1/1 ... [2025-12-04 12:46:53.898595][2291912.547774341], took 0.27min 2025-12-04T12:46:53.9008954Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:46:53.9026064Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:46:53.9029337Z Running distributed/fsdp/test_fsdp_comm 1/1 ... [2025-12-04 12:46:53.902767][2291912.551951177] 2025-12-04T12:46:53.9029706Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:46:53.9030947Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_comm.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:46:53.902959] 2025-12-04T12:52:45.7327593Z 2025-12-04T12:52:45.7328947Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm 1/1 (test/test-reports/distributed.fsdp.test_fsdp_comm_1.1_3b36b42e6bf366b5_.log) 2025-12-04T12:52:45.7329882Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-bb216e1619e0039d.xml 2025-12-04T12:52:45.7331226Z ============================= test session starts ============================== 2025-12-04T12:52:45.7331824Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7332223Z cachedir: .pytest_cache 2025-12-04T12:52:45.7332696Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7333209Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7333458Z configfile: pytest.ini 2025-12-04T12:52:45.7333900Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7334337Z collecting ... collected 10 items 2025-12-04T12:52:45.7334599Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T12:52:45.7338266Z Running 10 items in this shard: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda, test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.7341920Z 2025-12-04T12:52:45.7342569Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda I1204 12:46:55.740000 495855 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 495924 2025-12-04T12:52:45.7343544Z I1204 12:46:55.741000 495855 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 495925 2025-12-04T12:52:45.7344163Z I1204 12:46:55.742000 495855 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 495926 2025-12-04T12:52:45.7344675Z I1204 12:46:55.742000 495855 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 495927 2025-12-04T12:52:45.7345438Z 
/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7346094Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7346751Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7347334Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7348122Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7349050Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7350342Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7351120Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7351723Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7352307Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7353070Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7353851Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7354326Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7354788Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7355390Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7356005Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7356271Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7356664Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7357274Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7357823Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7358411Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7376181Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7376948Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7377441Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7377944Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7378485Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7378953Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7379406Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7379866Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7380345Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7381068Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
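The UserWarning above ("FSDP got the argument `device_id` cuda on rank N, which does not have an explicit index...") prescribes its own fix: either call torch.cuda.set_device() before constructing FSDP, or pass a device_id with an explicit index. A minimal sketch of both options follows, assuming the default process group is already initialized for this rank; it is not the test's own wrapping code.

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_with_fsdp(model, rank):
    # Option 1 from the warning: make the current device explicit first.
    torch.cuda.set_device(rank)
    return FSDP(model, device_id=torch.cuda.current_device())
    # Option 2 would be to pass the explicit index directly,
    # e.g. FSDP(model, device_id=rank), instead of the bare "cuda" device.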
2025-12-04T12:52:45.7381743Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7382100Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7382744Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7383303Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7383671Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7384091Z [rank1]:E1204 12:47:03.147000 495925 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7384366Z dist init r=1, world=4 2025-12-04T12:52:45.7384580Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7384922Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7385415Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7385918Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7386401Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7386859Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7387315Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7387795Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7388313Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7388775Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7389242Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7389696Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7390159Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7390628Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7391334Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7392002Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7392357Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7392998Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7393552Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7393962Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7394382Z [rank2]:E1204 12:47:03.153000 495926 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7394630Z dist init r=2, world=4 2025-12-04T12:52:45.7394840Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7395429Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7395918Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7396406Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7396891Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7397371Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7397807Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7398307Z [rank3]:E1204 12:47:03.294000 495927 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7398774Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7399236Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7399699Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7400153Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7400614Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7401093Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7401808Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7402482Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7402842Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7403507Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7404074Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7404450Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7404887Z [rank3]:E1204 12:47:03.294000 495927 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7405142Z dist init r=3, world=4 2025-12-04T12:52:45.7405356Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7405705Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7406203Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7406731Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7407222Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7407679Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7408133Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7408669Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7409149Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7409625Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7410100Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7410564Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7411031Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7411509Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7412225Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
2025-12-04T12:52:45.7412925Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7413287Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7413936Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7414512Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7414884Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7415312Z [rank0]:E1204 12:47:03.454000 495924 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7415564Z dist init r=0, world=4 2025-12-04T12:52:45.7416005Z [rank0]:[W1204 12:47:03.488733552 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7416436Z FAILED [9.3139s] [ 10%] 2025-12-04T12:52:45.7416512Z 2025-12-04T12:52:45.7416574Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7416821Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7417050Z Traceback (most recent call last): 2025-12-04T12:52:45.7417307Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7417562Z self._join_processes(fn) 2025-12-04T12:52:45.7417814Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7418084Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7418400Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7418670Z raise RuntimeError(error) 2025-12-04T12:52:45.7418829Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7419003Z Traceback (most recent call last): 2025-12-04T12:52:45.7419252Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7419495Z getattr(self, test_name)() 2025-12-04T12:52:45.7419737Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7419981Z fn() 2025-12-04T12:52:45.7420192Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7420439Z method(*args, **kwargs) 2025-12-04T12:52:45.7420669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7420901Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7421123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7421361Z with policy(): 2025-12-04T12:52:45.7421586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7421829Z raise RuntimeError(msg) 2025-12-04T12:52:45.7422322Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7422754Z 2025-12-04T12:52:45.7422834Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7423230Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7423548Z 2025-12-04T12:52:45.7423658Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7423795Z 2025-12-04T12:52:45.7423859Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7423999Z Traceback (most recent call last): 2025-12-04T12:52:45.7424244Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7424489Z getattr(self, test_name)() 2025-12-04T12:52:45.7424730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7425012Z fn() 2025-12-04T12:52:45.7425225Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7425455Z method(*args, **kwargs) 2025-12-04T12:52:45.7425672Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7425899Z method(*args, **kwargs) 2025-12-04T12:52:45.7426120Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7426347Z with policy(): 2025-12-04T12:52:45.7426564Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7426805Z raise RuntimeError(msg) 2025-12-04T12:52:45.7427273Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
2025-12-04T12:52:45.7427707Z 2025-12-04T12:52:45.7427792Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7428226Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7428535Z 2025-12-04T12:52:45.7428632Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7428759Z 2025-12-04T12:52:45.7428763Z 2025-12-04T12:52:45.7428853Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7429070Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7429445Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-bb216e1619e0039d.xml - 2025-12-04T12:52:45.7429789Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7430187Z FAILED [9.3139s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7430559Z Traceback (most recent call last): 2025-12-04T12:52:45.7430817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7431094Z getattr(self, test_name)() 2025-12-04T12:52:45.7431339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7431583Z fn() 2025-12-04T12:52:45.7431797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7432038Z method(*args, **kwargs) 2025-12-04T12:52:45.7432269Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7432509Z method(*args, **kwargs) 2025-12-04T12:52:45.7432753Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7432987Z with policy(): 2025-12-04T12:52:45.7433210Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7433452Z raise RuntimeError(msg) 2025-12-04T12:52:45.7433920Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7434391Z 2025-12-04T12:52:45.7434468Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7434864Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7435177Z 2025-12-04T12:52:45.7435271Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7435404Z 2025-12-04T12:52:45.7435466Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7435618Z Traceback (most recent call last): 2025-12-04T12:52:45.7435870Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7436122Z getattr(self, test_name)() 2025-12-04T12:52:45.7436364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7436607Z fn() 2025-12-04T12:52:45.7436817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7437070Z method(*args, **kwargs) 2025-12-04T12:52:45.7437297Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7437526Z method(*args, **kwargs) 2025-12-04T12:52:45.7437748Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7437976Z with policy(): 2025-12-04T12:52:45.7438219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7438456Z raise RuntimeError(msg) 2025-12-04T12:52:45.7438922Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7439349Z 2025-12-04T12:52:45.7439424Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7439808Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7440115Z 2025-12-04T12:52:45.7440225Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7440418Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7440584Z ============================== 1 failed in 9.32s =============================== 2025-12-04T12:52:45.7440721Z Got exit code 1 2025-12-04T12:52:45.7440823Z Retrying single test... 
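[editorial sketch] The ProcessGroupNCCL warning repeated above ("destroy_process_group() was not called before program exit, which can leak resources") describes the standard teardown it expects. A minimal illustration of that pattern, assuming a torchrun-style environment where RANK/MASTER_ADDR/MASTER_PORT are already set; this is not the actual harness code in common_distributed.py:

    # Illustrative only: explicit process-group teardown, as the NCCL warning asks for.
    import os
    import torch.distributed as dist

    def main() -> None:
        # Assumes env:// rendezvous variables are provided by the launcher (e.g. torchrun).
        dist.init_process_group(backend="nccl", init_method="env://")
        try:
            pass  # test or training body would run here
        finally:
            # Tearing the group down before exit avoids the resource-leak warning.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()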
2025-12-04T12:52:45.7441079Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a748046d038cbc77.xml 2025-12-04T12:52:45.7441366Z ============================= test session starts ============================== 2025-12-04T12:52:45.7441600Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7441792Z cachedir: .pytest_cache 2025-12-04T12:52:45.7442022Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7442268Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7442394Z configfile: pytest.ini 2025-12-04T12:52:45.7442623Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7442926Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7443301Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7443644Z Running 1 items in this shard 2025-12-04T12:52:45.7443718Z 2025-12-04T12:52:45.7444067Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda I1204 12:47:07.944000 496257 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 496326 2025-12-04T12:52:45.7444603Z I1204 12:47:07.945000 496257 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 496327 2025-12-04T12:52:45.7444948Z I1204 12:47:07.946000 496257 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 496328 2025-12-04T12:52:45.7445295Z I1204 12:47:07.946000 496257 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 496329 2025-12-04T12:52:45.7445850Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7446294Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7446874Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7447462Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7447915Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7448414Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7449004Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. 
FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7449598Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7450054Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7450493Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7451076Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7451656Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7452108Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7452577Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7453150Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7453734Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7453977Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7454321Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7454814Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7455303Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7455788Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7456240Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7456685Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7457154Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7457621Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7458089Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7458608Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7459062Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7459519Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7459991Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7460721Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
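[editorial sketch] The UserWarning from torch/distributed/fsdp/_init_utils.py above suggests two remedies: call torch.cuda.set_device() before FSDP initialization, or pass an explicitly indexed device as device_id. A minimal sketch of both options, assuming a rank integer and an existing nn.Module; illustrative only, not the wrapping code used by this test:

    # Illustrative only: silencing the FSDP `device_id` warning seen above.
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(model: torch.nn.Module, rank: int) -> FSDP:
        # Option 1: pin the current device first, then let FSDP pick it up.
        torch.cuda.set_device(rank)
        # Option 2: pass an indexed device instead of the bare "cuda" device.
        return FSDP(model, device_id=torch.device("cuda", rank))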
2025-12-04T12:52:45.7461388Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7461742Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7462415Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7462974Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7463342Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7463760Z [rank1]:E1204 12:47:15.252000 496327 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7464005Z dist init r=1, world=4 2025-12-04T12:52:45.7464214Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7464557Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7465052Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7465537Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7466019Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7466477Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7466920Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7467387Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7467855Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7468389Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7468858Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7469322Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7469795Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7470263Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7470977Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7471675Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7472026Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7472667Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7473221Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7473591Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7474008Z [rank0]:E1204 12:47:15.323000 496326 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7474251Z dist init r=0, world=4 2025-12-04T12:52:45.7474460Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7474804Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7475297Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7475782Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7476264Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7476717Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7477158Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7477639Z [rank2]:E1204 12:47:15.379000 496328 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7478212Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7478677Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7479160Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7479619Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7480076Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7480565Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7481286Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7481956Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7482308Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7482940Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7483494Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7483862Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7484277Z [rank2]:E1204 12:47:15.379000 496328 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7484521Z dist init r=2, world=4 2025-12-04T12:52:45.7484729Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7485067Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7485561Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7486045Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7486526Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7486989Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7487433Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7487903Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7488425Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7488898Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7489366Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7489822Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7490309Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7490780Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7491488Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
2025-12-04T12:52:45.7492151Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7492506Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7493143Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7493691Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7494060Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7494480Z [rank3]:E1204 12:47:15.410000 496329 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7494723Z dist init r=3, world=4 2025-12-04T12:52:45.7495129Z [rank0]:[W1204 12:47:15.149404416 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7495540Z FAILED [9.1147s] [100%] 2025-12-04T12:52:45.7495605Z 2025-12-04T12:52:45.7495667Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7495904Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7496128Z Traceback (most recent call last): 2025-12-04T12:52:45.7496394Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7496640Z self._join_processes(fn) 2025-12-04T12:52:45.7496891Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7497158Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7497427Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7497688Z raise RuntimeError(error) 2025-12-04T12:52:45.7497866Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7498033Z Traceback (most recent call last): 2025-12-04T12:52:45.7498322Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7498564Z getattr(self, test_name)() 2025-12-04T12:52:45.7498800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7499050Z fn() 2025-12-04T12:52:45.7499256Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7499506Z method(*args, **kwargs) 2025-12-04T12:52:45.7499732Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7499967Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7500191Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7500421Z with policy(): 2025-12-04T12:52:45.7500636Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7500869Z raise RuntimeError(msg) 2025-12-04T12:52:45.7501332Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7501759Z 2025-12-04T12:52:45.7501837Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7502223Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7502531Z 2025-12-04T12:52:45.7502623Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7502749Z 2025-12-04T12:52:45.7502754Z 2025-12-04T12:52:45.7502834Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7503037Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7503395Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a748046d038cbc77.xml - 2025-12-04T12:52:45.7503727Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7504121Z FAILED [9.1147s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7504490Z Traceback (most recent call last): 2025-12-04T12:52:45.7504742Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7504986Z getattr(self, test_name)() 2025-12-04T12:52:45.7505242Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7505479Z fn() 2025-12-04T12:52:45.7505685Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7505919Z method(*args, **kwargs) 2025-12-04T12:52:45.7506144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7506374Z method(*args, **kwargs) 2025-12-04T12:52:45.7506905Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7507136Z with policy(): 2025-12-04T12:52:45.7507349Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7507582Z raise RuntimeError(msg) 2025-12-04T12:52:45.7508046Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! 
Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7508546Z 2025-12-04T12:52:45.7508621Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7509007Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7509317Z 2025-12-04T12:52:45.7509408Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7509598Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7509768Z ======================= 1 failed, 9 deselected in 9.13s ======================== 2025-12-04T12:52:45.7509909Z Got exit code 1 2025-12-04T12:52:45.7510008Z Retrying single test... 2025-12-04T12:52:45.7510271Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-6fe0148fc13ba808.xml 2025-12-04T12:52:45.7510562Z ============================= test session starts ============================== 2025-12-04T12:52:45.7510776Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7510966Z cachedir: .pytest_cache 2025-12-04T12:52:45.7511193Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7511432Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7511557Z configfile: pytest.ini 2025-12-04T12:52:45.7511790Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7512063Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7512438Z stepcurrent: skipping 0 already run items. 
Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7512790Z Running 1 items in this shard 2025-12-04T12:52:45.7512863Z 2025-12-04T12:52:45.7513210Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda I1204 12:47:19.735000 496659 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 496728 2025-12-04T12:52:45.7513745Z I1204 12:47:19.736000 496659 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 496729 2025-12-04T12:52:45.7514090Z I1204 12:47:19.737000 496659 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 496730 2025-12-04T12:52:45.7514451Z I1204 12:47:19.738000 496659 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 496731 2025-12-04T12:52:45.7515008Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7515453Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7516051Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7516642Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7517096Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7517566Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7518138Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7518760Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7519218Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7519656Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7520230Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7520816Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7521270Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7521711Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7522285Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7522876Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7523123Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7523476Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7523986Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7524475Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7524959Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7525415Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7525883Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7526359Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7526836Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7527341Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7527816Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7528309Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7528773Z [rank3]:E1204 12:47:26.986000 496731 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7529248Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7529970Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7530645Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7531005Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7531647Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7532217Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7532594Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7533018Z [rank3]:E1204 12:47:26.986000 496731 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7533272Z dist init r=3, world=4 2025-12-04T12:52:45.7533508Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7533857Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7534357Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7534846Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7535345Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7535806Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7536256Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7536760Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.7537236Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7537706Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7538215Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7538678Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7539150Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7539623Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7540338Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7541015Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7541374Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7542015Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7542570Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7542953Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7543371Z [rank0]:E1204 12:47:27.056000 496728 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7543619Z dist init r=0, world=4 2025-12-04T12:52:45.7543826Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7544170Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7544678Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7545166Z [rank2]:E1204 12:47:27.175000 496730 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7545659Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7546150Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7546600Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7547081Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7547557Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7548030Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7548545Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7549006Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7549472Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7549946Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7550654Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
2025-12-04T12:52:45.7551320Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7551681Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7552330Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7552884Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7553255Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7553677Z [rank2]:E1204 12:47:27.175000 496730 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7553922Z dist init r=2, world=4 2025-12-04T12:52:45.7554143Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7554484Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7554971Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7555485Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7555973Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7556425Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7556871Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7557336Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7557804Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7558316Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7558789Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7559243Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7559700Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7560168Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7560877Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7561541Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7561912Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7562547Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7563097Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7563474Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7563889Z [rank1]:E1204 12:47:27.251000 496729 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7564137Z dist init r=1, world=4 2025-12-04T12:52:45.7564540Z [rank0]:[W1204 12:47:27.919627588 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7564994Z FAILED [9.1145s] [100%] 2025-12-04T12:52:45.7565058Z 2025-12-04T12:52:45.7565121Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7565355Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7565580Z Traceback (most recent call last): 2025-12-04T12:52:45.7565839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7566096Z self._join_processes(fn) 2025-12-04T12:52:45.7566356Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7566633Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7566911Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7567174Z raise RuntimeError(error) 2025-12-04T12:52:45.7567330Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7567495Z Traceback (most recent call last): 2025-12-04T12:52:45.7567740Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7567987Z getattr(self, test_name)() 2025-12-04T12:52:45.7568265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7568506Z fn() 2025-12-04T12:52:45.7568711Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7568945Z method(*args, **kwargs) 2025-12-04T12:52:45.7569170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7569410Z method(*args, **kwargs) 2025-12-04T12:52:45.7569631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7569861Z with policy(): 2025-12-04T12:52:45.7570079Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7570313Z raise RuntimeError(msg) 2025-12-04T12:52:45.7570795Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 
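Editor's note on the ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit, which can leak resources"): a distributed script normally tears down its default process group explicitly. The snippet below is only a minimal sketch of that teardown pattern, not the test's actual code; the backend, addresses, rank and world size are placeholders (the job itself runs NCCL/RCCL on 4 GPUs).

    # Minimal sketch of explicit process-group teardown; placeholders only,
    # not the code of test_fsdp_comm.py. "gloo" keeps the sketch runnable on CPU.
    import os
    import torch.distributed as dist

    def main() -> None:
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        rank = int(os.environ.get("RANK", "0"))
        world_size = int(os.environ.get("WORLD_SIZE", "1"))
        dist.init_process_group("gloo", rank=rank, world_size=world_size)
        try:
            pass  # per-rank test body would run here
        finally:
            # Explicit teardown avoids the "destroy_process_group() was not
            # called before program exit" warning seen in the log above.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()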
2025-12-04T12:52:45.7571223Z 2025-12-04T12:52:45.7571305Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7571698Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7572010Z 2025-12-04T12:52:45.7572106Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7572234Z 2025-12-04T12:52:45.7572300Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7572464Z Traceback (most recent call last): 2025-12-04T12:52:45.7572713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7572961Z getattr(self, test_name)() 2025-12-04T12:52:45.7573200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7573438Z fn() 2025-12-04T12:52:45.7573662Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7573913Z method(*args, **kwargs) 2025-12-04T12:52:45.7574133Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7574364Z method(*args, **kwargs) 2025-12-04T12:52:45.7574586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7574812Z with policy(): 2025-12-04T12:52:45.7575024Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7575255Z raise RuntimeError(msg) 2025-12-04T12:52:45.7575713Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7576141Z 2025-12-04T12:52:45.7576215Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7576600Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7576907Z 2025-12-04T12:52:45.7576995Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7577122Z 2025-12-04T12:52:45.7577124Z 2025-12-04T12:52:45.7577201Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7577405Z Process 0 terminated with exit code 10, terminating remaining processes. 
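Editor's note on the traceback above: each rank exits with code 10 when the leak check raises, and _join_processes / _check_return_codes in common_distributed.py re-raise in the parent pytest process ("Process 0 exited with error code 10 and exception: ..."). The sketch below is a simplified illustration of that spawn-and-check pattern using only the standard multiprocessing module; it is not the harness implementation.

    # Simplified illustration of the spawn-and-check pattern; the real logic
    # lives in common_distributed.py (_join_processes / _check_return_codes).
    import multiprocessing as mp
    import sys

    def _worker(rank: int, world_size: int) -> None:
        ok = True                    # a detected leak in the real harness flips this
        sys.exit(0 if ok else 10)    # exit code 10 is what the log reports

    def run_parallel_test(world_size: int = 4) -> None:
        procs = [mp.Process(target=_worker, args=(r, world_size)) for r in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        failed = [(p.pid, p.exitcode) for p in procs if p.exitcode != 0]
        if failed:
            # Mirrors "RuntimeError: Process N exited with error code 10 ...".
            raise RuntimeError(f"worker processes failed: {failed}")

    if __name__ == "__main__":
        run_parallel_test()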
2025-12-04T12:52:45.7577762Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-6fe0148fc13ba808.xml - 2025-12-04T12:52:45.7578094Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7578516Z FAILED [9.1145s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7578879Z Traceback (most recent call last): 2025-12-04T12:52:45.7579126Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7579369Z getattr(self, test_name)() 2025-12-04T12:52:45.7579624Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7579862Z fn() 2025-12-04T12:52:45.7580065Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7580296Z method(*args, **kwargs) 2025-12-04T12:52:45.7580517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7580749Z method(*args, **kwargs) 2025-12-04T12:52:45.7580972Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7581201Z with policy(): 2025-12-04T12:52:45.7581432Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7581663Z raise RuntimeError(msg) 2025-12-04T12:52:45.7582126Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 
2025-12-04T12:52:45.7582581Z 2025-12-04T12:52:45.7582664Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7583052Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7583359Z 2025-12-04T12:52:45.7583455Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7583579Z 2025-12-04T12:52:45.7583646Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7583795Z Traceback (most recent call last): 2025-12-04T12:52:45.7584045Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7584294Z getattr(self, test_name)() 2025-12-04T12:52:45.7584533Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7584772Z fn() 2025-12-04T12:52:45.7584979Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7585214Z method(*args, **kwargs) 2025-12-04T12:52:45.7585440Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7585676Z method(*args, **kwargs) 2025-12-04T12:52:45.7585899Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7586132Z with policy(): 2025-12-04T12:52:45.7586349Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7586587Z raise RuntimeError(msg) 2025-12-04T12:52:45.7587055Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7587481Z 2025-12-04T12:52:45.7587564Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7587952Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7588301Z 2025-12-04T12:52:45.7588391Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7588606Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.7588779Z ======================= 1 failed, 9 deselected in 9.12s ======================== 2025-12-04T12:52:45.7588926Z Got exit code 1 2025-12-04T12:52:45.7589213Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.7589603Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.7589964Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-32fc4cd2f4792970.xml 2025-12-04T12:52:45.7590271Z ============================= test session starts ============================== 2025-12-04T12:52:45.7590490Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7590685Z cachedir: .pytest_cache 2025-12-04T12:52:45.7590917Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7591161Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7591304Z configfile: pytest.ini 2025-12-04T12:52:45.7591555Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7591831Z collecting ... collected 10 items / 1 deselected / 9 selected 2025-12-04T12:52:45.7591999Z stepcurrent: skipping 1 already run items. 2025-12-04T12:52:45.7592135Z Running 9 items in this shard 2025-12-04T12:52:45.7592212Z 2025-12-04T12:52:45.7592565Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda I1204 12:47:31.524000 497061 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 497130 2025-12-04T12:52:45.7593104Z I1204 12:47:31.525000 497061 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 497131 2025-12-04T12:52:45.7593452Z I1204 12:47:31.526000 497061 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 497132 2025-12-04T12:52:45.7593798Z I1204 12:47:31.526000 497061 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 497133 2025-12-04T12:52:45.7594357Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7594810Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7595395Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
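Editor's note on the repeated UserWarning from torch/distributed/fsdp/_init_utils.py above: the warning names its own two remedies, namely setting the rank's CUDA device as current before constructing FSDP, or passing a device_id with an explicit index instead of the bare "cuda" string. A minimal hedged sketch of both options follows; `model` and `rank` are placeholders, not objects from this test.

    # Sketch of the two remedies the FSDP warning suggests; `model` and `rank`
    # are placeholders, not values taken from the failing test.
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_fsdp(model: torch.nn.Module, rank: int) -> FSDP:
        # Option 1: make the per-rank device current before FSDP initialization.
        torch.cuda.set_device(rank)
        # Option 2: pass an indexed device instead of the bare "cuda" string
        # that triggered the warning above.
        return FSDP(model, device_id=torch.device("cuda", rank))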
2025-12-04T12:52:45.7595988Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7596449Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7596891Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7597482Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7598076Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7598571Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7599013Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7599465Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7599902Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7600477Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7601100Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7601690Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7602276Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7602524Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7602877Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7603374Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7603869Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7604359Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7604815Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7605263Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7605737Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7606210Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7606680Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7607163Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7607620Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7608088Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7608597Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7609327Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7609997Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7610354Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7611023Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7611580Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7611950Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7612372Z [rank1]:E1204 12:47:38.794000 497131 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7612620Z dist init r=1, world=4 2025-12-04T12:52:45.7612832Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7613179Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7613672Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7614156Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7614644Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7615099Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7615545Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7616016Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7616485Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7617000Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7617473Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7617934Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7618434Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7618905Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7619618Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7620312Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7620672Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7621316Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7621871Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7622243Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7622667Z [rank0]:E1204 12:47:38.899000 497130 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7622917Z dist init r=0, world=4 2025-12-04T12:52:45.7623130Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7623476Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7623968Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7624459Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7624942Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7625393Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7625840Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7626322Z [rank2]:E1204 12:47:38.998000 497132 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7626791Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7627261Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7627750Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7628247Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7628708Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7629193Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7629917Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7630583Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7630938Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7631578Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7632134Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7632505Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7632923Z [rank2]:E1204 12:47:38.998000 497132 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7633171Z dist init r=2, world=4 2025-12-04T12:52:45.7633381Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7633724Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7634223Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7634708Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7649946Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7650497Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7650941Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7651413Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7651895Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7652354Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7652815Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7653263Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7653744Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7654206Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7654921Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
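Editor's note on the RuntimeError failing each rank: it is raised by the leak-check context manager in common_utils.py, which compares caching-allocator and driver-level memory counters before and after the test body (the "was 512 and is now reported as 19456" / "driver allocated memory was ... and is now ..." figures above). The snippet below is only a rough sketch of that kind of before/after comparison, with none of the real checker's thresholds, retries, or per-device bookkeeping.

    # Rough sketch of a before/after memory comparison in the spirit of the
    # leak checker's report; this is NOT the implementation in common_utils.py.
    import torch

    def check_for_leak(fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)     # caching-allocator bytes
        free_before, _total = torch.cuda.mem_get_info(device)  # driver-level view
        fn()
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _total = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before or free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes, "
                f"driver free memory {free_before} -> {free_after} bytes"
            )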
2025-12-04T12:52:45.7655585Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7655933Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7656566Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7657117Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7657481Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7657892Z [rank3]:E1204 12:47:39.037000 497133 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7658138Z dist init r=3, world=4 2025-12-04T12:52:45.7658589Z [rank0]:[W1204 12:47:39.764541734 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7658997Z FAILED [9.1139s] [ 11%] 2025-12-04T12:52:45.7659062Z 2025-12-04T12:52:45.7659122Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7659353Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7659569Z Traceback (most recent call last): 2025-12-04T12:52:45.7659831Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7660073Z self._join_processes(fn) 2025-12-04T12:52:45.7660320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7660583Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7660849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7661108Z raise RuntimeError(error) 2025-12-04T12:52:45.7661290Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7661449Z Traceback (most recent call last): 2025-12-04T12:52:45.7661686Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7661927Z getattr(self, test_name)() 2025-12-04T12:52:45.7662155Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7662406Z fn() 2025-12-04T12:52:45.7662604Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7662850Z method(*args, **kwargs) 2025-12-04T12:52:45.7663070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7663296Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7663514Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7663738Z with policy(): 2025-12-04T12:52:45.7663947Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7664176Z raise RuntimeError(msg) 2025-12-04T12:52:45.7664639Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7665065Z 2025-12-04T12:52:45.7665140Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7665526Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7665835Z 2025-12-04T12:52:45.7665924Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7666050Z 2025-12-04T12:52:45.7666052Z 2025-12-04T12:52:45.7666133Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7666333Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7666691Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-32fc4cd2f4792970.xml - 2025-12-04T12:52:45.7667021Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7667405Z FAILED [9.1139s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7667769Z Traceback (most recent call last): 2025-12-04T12:52:45.7668012Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7668298Z getattr(self, test_name)() 2025-12-04T12:52:45.7668551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7668782Z fn() 2025-12-04T12:52:45.7668982Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7669212Z method(*args, **kwargs) 2025-12-04T12:52:45.7669429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7669653Z method(*args, **kwargs) 2025-12-04T12:52:45.7669883Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7670107Z with policy(): 2025-12-04T12:52:45.7670314Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7670539Z raise RuntimeError(msg) 2025-12-04T12:52:45.7670998Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! 
Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2464153600 and is now 3307208704. 2025-12-04T12:52:45.7671448Z 2025-12-04T12:52:45.7671526Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7671909Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7672214Z 2025-12-04T12:52:45.7672305Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7672491Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7672654Z ======================= 1 failed, 1 deselected in 9.12s ======================== 2025-12-04T12:52:45.7672794Z Got exit code 1 2025-12-04T12:52:45.7672890Z Retrying single test... 2025-12-04T12:52:45.7673142Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-04466835009a9b1b.xml 2025-12-04T12:52:45.7673428Z ============================= test session starts ============================== 2025-12-04T12:52:45.7673638Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7673822Z cachedir: .pytest_cache 2025-12-04T12:52:45.7674046Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7674282Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7674399Z configfile: pytest.ini 2025-12-04T12:52:45.7674625Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7674895Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7675268Z stepcurrent: skipping 1 already run items. 
Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7675614Z Running 1 items in this shard 2025-12-04T12:52:45.7675685Z 2025-12-04T12:52:45.7676030Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda I1204 12:47:43.388000 497463 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 497532 2025-12-04T12:52:45.7676562Z I1204 12:47:43.389000 497463 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 497533 2025-12-04T12:52:45.7676901Z I1204 12:47:43.390000 497463 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 497534 2025-12-04T12:52:45.7677251Z I1204 12:47:43.391000 497463 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 497535 2025-12-04T12:52:45.7677798Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7678291Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7678887Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7679470Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7679925Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7680397Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7680964Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7681544Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7681993Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7682424Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7682986Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7683563Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7684009Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7684437Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7685004Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7685582Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7685819Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7686160Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7686663Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7687138Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7687615Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7688062Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7688552Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7689014Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7689476Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7689965Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7690427Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7690874Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7691324Z [rank1]:E1204 12:47:50.648000 497533 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7691785Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7692493Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7693153Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7693500Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7694134Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7694686Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7695049Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7695461Z [rank1]:E1204 12:47:50.648000 497533 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7695699Z dist init r=1, world=4 2025-12-04T12:52:45.7695926Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7696260Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7696744Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7697219Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7697707Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7698203Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7698639Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7699129Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.7699589Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7700047Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7700510Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7700957Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7701409Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7701866Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7702566Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7703229Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7703577Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7704207Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7704751Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7705128Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7705540Z [rank2]:E1204 12:47:50.699000 497534 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7705779Z dist init r=2, world=4 2025-12-04T12:52:45.7705977Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7706309Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7706805Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7707280Z [rank0]:E1204 12:47:50.814000 497532 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7707751Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7708254Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7708691Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7709152Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7709614Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7710072Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7710531Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7710978Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7711430Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7711890Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7712587Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
2025-12-04T12:52:45.7713247Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7713596Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7714244Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7714788Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7715152Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7715563Z [rank0]:E1204 12:47:50.814000 497532 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7715802Z dist init r=0, world=4 2025-12-04T12:52:45.7716019Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7716355Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7716838Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7717345Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7717822Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7718300Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7718737Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7719195Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7719654Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7720116Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7720575Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7721022Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7721476Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7721938Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7722645Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7723302Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7723659Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7724288Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7724837Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7725208Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7725619Z [rank3]:E1204 12:47:50.839000 497535 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7725859Z dist init r=3, world=4 2025-12-04T12:52:45.7726260Z [rank0]:[W1204 12:47:51.743317954 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7726698Z FAILED [9.1153s] [100%] 2025-12-04T12:52:45.7726761Z 2025-12-04T12:52:45.7726822Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7727053Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7727273Z Traceback (most recent call last): 2025-12-04T12:52:45.7727515Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7727758Z self._join_processes(fn) 2025-12-04T12:52:45.7728002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7728291Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7728559Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7728820Z raise RuntimeError(error) 2025-12-04T12:52:45.7728974Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7729134Z Traceback (most recent call last): 2025-12-04T12:52:45.7729373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7729614Z getattr(self, test_name)() 2025-12-04T12:52:45.7729843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7730072Z fn() 2025-12-04T12:52:45.7730274Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7730502Z method(*args, **kwargs) 2025-12-04T12:52:45.7730723Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7730952Z method(*args, **kwargs) 2025-12-04T12:52:45.7731167Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7731390Z with policy(): 2025-12-04T12:52:45.7731602Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7731832Z raise RuntimeError(msg) 2025-12-04T12:52:45.7732307Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7732732Z 2025-12-04T12:52:45.7732806Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7733189Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7733499Z 2025-12-04T12:52:45.7733590Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7733715Z 2025-12-04T12:52:45.7733717Z 2025-12-04T12:52:45.7733808Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7734011Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7734367Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-04466835009a9b1b.xml - 2025-12-04T12:52:45.7734693Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7735097Z FAILED [9.1153s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7735474Z Traceback (most recent call last): 2025-12-04T12:52:45.7735717Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7735955Z getattr(self, test_name)() 2025-12-04T12:52:45.7736189Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7736417Z fn() 2025-12-04T12:52:45.7736616Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7736842Z method(*args, **kwargs) 2025-12-04T12:52:45.7737057Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7737284Z method(*args, **kwargs) 2025-12-04T12:52:45.7737503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7737724Z with policy(): 2025-12-04T12:52:45.7737935Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7738205Z raise RuntimeError(msg) 2025-12-04T12:52:45.7738667Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7739090Z 2025-12-04T12:52:45.7739164Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7739550Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7739855Z 2025-12-04T12:52:45.7739944Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7740129Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
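The mem_leak_check failure above is reported per rank: the caching allocator grew from 512 to 19456 bytes and driver-level allocation grew by roughly 0.8 GB on every device, so the check flags a leak and each worker exits with code 10. As a rough local approximation of that before/after comparison (a hedged sketch only, not the actual check in common_utils.py; the helper name assert_no_cuda_leak is made up for illustration):

import gc
from contextlib import contextmanager
import torch

@contextmanager
def assert_no_cuda_leak(device: int):
    # Settle outstanding GPU work and drop cached blocks before sampling.
    torch.cuda.synchronize(device)
    gc.collect()
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes
    free, total = torch.cuda.mem_get_info(device)
    driver_before = total - free                          # driver-level bytes in use
    yield
    torch.cuda.synchronize(device)
    gc.collect()
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible CUDA leak on device {device}: allocator "
            f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
        )

The repro command printed in the log (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda) re-runs only this test with the same check enabled.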
2025-12-04T12:52:45.7740291Z ======================= 1 failed, 9 deselected in 9.13s ======================== 2025-12-04T12:52:45.7740427Z Got exit code 1 2025-12-04T12:52:45.7740521Z Retrying single test... 2025-12-04T12:52:45.7740773Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-1a7d0527355a3ac0.xml 2025-12-04T12:52:45.7741067Z ============================= test session starts ============================== 2025-12-04T12:52:45.7741277Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7741466Z cachedir: .pytest_cache 2025-12-04T12:52:45.7741687Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7741925Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7742041Z configfile: pytest.ini 2025-12-04T12:52:45.7742280Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7742550Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7742924Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7743261Z Running 1 items in this shard 2025-12-04T12:52:45.7743332Z 2025-12-04T12:52:45.7743677Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda I1204 12:47:55.370000 497865 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 497934 2025-12-04T12:52:45.7744253Z I1204 12:47:55.371000 497865 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 497935 2025-12-04T12:52:45.7744595Z I1204 12:47:55.372000 497865 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 497936 2025-12-04T12:52:45.7744938Z I1204 12:47:55.372000 497865 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 497937 2025-12-04T12:52:45.7745490Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7745930Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7746507Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7747093Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7747542Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7747976Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7748587Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7749173Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7749617Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7750046Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7750485Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7750915Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7751491Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7752070Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7752652Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7753243Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7753495Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7753832Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7754321Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7754799Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7755274Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7755719Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7756156Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7756617Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7757082Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7757539Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7758004Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7758533Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7758984Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7759459Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7760168Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
2025-12-04T12:52:45.7760831Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7761193Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7761825Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7762375Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7762771Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7763183Z [rank3]:E1204 12:48:02.710000 497937 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7763421Z dist init r=3, world=4 2025-12-04T12:52:45.7763624Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7763956Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7764440Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7764916Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7765392Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7765839Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7766277Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7766739Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7767204Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7767663Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7768123Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7768606Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7769075Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7769538Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7770251Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2462056448 and is now 3307208704. 2025-12-04T12:52:45.7770912Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7771260Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7771902Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7772462Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7772822Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7773234Z [rank0]:E1204 12:48:02.807000 497934 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7773474Z dist init r=0, world=4 2025-12-04T12:52:45.7773678Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7774013Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7774493Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7774967Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7775440Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7775881Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7776158Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7776306Z [rank1]:E1204 12:48:02.812000 497935 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7776584Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7776731Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7777024Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7777162Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7777441Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7777598Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7778113Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7778277Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7778486Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7778885Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7778997Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7779216Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7779387Z [rank1]:E1204 12:48:02.812000 497935 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7779428Z dist init r=1, world=4 2025-12-04T12:52:45.7779568Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7779728Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7780016Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7780171Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7780457Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7780583Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7780864Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7781012Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7781300Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7781450Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7781726Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7781876Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7782154Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7782308Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7782827Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
2025-12-04T12:52:45.7782965Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7783163Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7783562Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7783678Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7783892Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7784058Z [rank2]:E1204 12:48:02.852000 497936 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7784098Z dist init r=2, world=4 2025-12-04T12:52:45.7784437Z [rank0]:[W1204 12:48:02.634762525 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7784479Z FAILED [9.2148s] [100%] 2025-12-04T12:52:45.7784482Z 2025-12-04T12:52:45.7784541Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7784682Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7784730Z Traceback (most recent call last): 2025-12-04T12:52:45.7784897Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7784940Z self._join_processes(fn) 2025-12-04T12:52:45.7785115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7785170Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7785361Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7785405Z raise RuntimeError(error) 2025-12-04T12:52:45.7785488Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7785534Z Traceback (most recent call last): 2025-12-04T12:52:45.7785697Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7785740Z getattr(self, test_name)() 2025-12-04T12:52:45.7785897Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7785935Z fn() 2025-12-04T12:52:45.7786102Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7786146Z method(*args, **kwargs) 2025-12-04T12:52:45.7786298Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7786340Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7786490Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7786551Z with policy(): 2025-12-04T12:52:45.7786704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7786746Z raise RuntimeError(msg) 2025-12-04T12:52:45.7787141Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2462056448 and is now 3307208704. 2025-12-04T12:52:45.7787144Z 2025-12-04T12:52:45.7787221Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7787496Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7787500Z 2025-12-04T12:52:45.7787588Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7787591Z 2025-12-04T12:52:45.7787653Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7787697Z Traceback (most recent call last): 2025-12-04T12:52:45.7787861Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7787904Z getattr(self, test_name)() 2025-12-04T12:52:45.7788066Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7788101Z fn() 2025-12-04T12:52:45.7788294Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7788334Z method(*args, **kwargs) 2025-12-04T12:52:45.7788486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7788527Z method(*args, **kwargs) 2025-12-04T12:52:45.7788679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7788716Z with policy(): 2025-12-04T12:52:45.7788870Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7788912Z raise RuntimeError(msg) 2025-12-04T12:52:45.7789318Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
2025-12-04T12:52:45.7789320Z 2025-12-04T12:52:45.7789395Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7789670Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7789673Z 2025-12-04T12:52:45.7789764Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7789766Z 2025-12-04T12:52:45.7789768Z 2025-12-04T12:52:45.7789845Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7789948Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7790180Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-1a7d0527355a3ac0.xml - 2025-12-04T12:52:45.7790243Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7790527Z FAILED [9.2148s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.7790605Z Traceback (most recent call last): 2025-12-04T12:52:45.7790769Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7790815Z getattr(self, test_name)() 2025-12-04T12:52:45.7790981Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7791015Z fn() 2025-12-04T12:52:45.7791170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7791210Z method(*args, **kwargs) 2025-12-04T12:52:45.7791363Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7791404Z method(*args, **kwargs) 2025-12-04T12:52:45.7791556Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7791594Z with policy(): 2025-12-04T12:52:45.7791749Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7791790Z raise RuntimeError(msg) 2025-12-04T12:52:45.7792183Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2462056448 and is now 3307208704. 
2025-12-04T12:52:45.7792186Z 2025-12-04T12:52:45.7792260Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7792537Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7792541Z 2025-12-04T12:52:45.7792628Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7792633Z 2025-12-04T12:52:45.7792694Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7792742Z Traceback (most recent call last): 2025-12-04T12:52:45.7792904Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7792949Z getattr(self, test_name)() 2025-12-04T12:52:45.7793107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7793155Z fn() 2025-12-04T12:52:45.7793306Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7793349Z method(*args, **kwargs) 2025-12-04T12:52:45.7793500Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7793541Z method(*args, **kwargs) 2025-12-04T12:52:45.7793689Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7793729Z with policy(): 2025-12-04T12:52:45.7793889Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7793933Z raise RuntimeError(msg) 2025-12-04T12:52:45.7794320Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7794332Z 2025-12-04T12:52:45.7794408Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7794690Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7794692Z 2025-12-04T12:52:45.7794779Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7794848Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
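Two recoverable issues repeat on every retry of this test: the FSDP UserWarning that `device_id` was passed as a bare "cuda" device with no index, and the ProcessGroupNCCL warning that destroy_process_group() was not called before exit. A minimal per-rank setup/teardown sketch that avoids both warnings (illustrative only; this is not how common_distributed.py actually spawns its test workers, and it assumes MASTER_ADDR/MASTER_PORT are set in the environment):

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def run_rank(rank: int, world_size: int, model: torch.nn.Module) -> None:
    # Uses the default env:// init, so MASTER_ADDR / MASTER_PORT must be exported.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)                           # make the current device explicit before FSDP init
    fsdp_model = FSDP(model.cuda(rank), device_id=rank)   # explicit device index silences the device_id warning
    try:
        # training / test body would run fsdp_model here
        pass
    finally:
        dist.destroy_process_group()                      # releases NCCL resources and avoids the shutdown warning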
2025-12-04T12:52:45.7794911Z ======================= 1 failed, 9 deselected in 9.22s ======================== 2025-12-04T12:52:45.7794951Z Got exit code 1 2025-12-04T12:52:45.7795173Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.7795305Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.7795496Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-3614e4f517affde8.xml 2025-12-04T12:52:45.7795557Z ============================= test session starts ============================== 2025-12-04T12:52:45.7795670Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7795712Z cachedir: .pytest_cache 2025-12-04T12:52:45.7795869Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7795918Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7795958Z configfile: pytest.ini 2025-12-04T12:52:45.7796123Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7796196Z collecting ... collected 10 items / 2 deselected / 8 selected 2025-12-04T12:52:45.7796252Z stepcurrent: skipping 2 already run items. 2025-12-04T12:52:45.7796297Z Running 8 items in this shard 2025-12-04T12:52:45.7796299Z 2025-12-04T12:52:45.7796645Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda I1204 12:48:07.244000 498267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 498336 2025-12-04T12:52:45.7796801Z I1204 12:48:07.245000 498267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 498337 2025-12-04T12:52:45.7796953Z I1204 12:48:07.246000 498267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 498338 2025-12-04T12:52:45.7797115Z I1204 12:48:07.246000 498267 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 498339 2025-12-04T12:52:45.7797475Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7797529Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7798031Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7798097Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7798496Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7798571Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7799061Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7799122Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7799477Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7799524Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7800012Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7800075Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7800427Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7800474Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7800962Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7801023Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7801166Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7801329Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7801640Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7801794Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7802083Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7802209Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7802500Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7802651Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7802930Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7803099Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7803374Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7803512Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7803790Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7803939Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7804457Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 
2025-12-04T12:52:45.7804574Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7804771Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7805170Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7805292Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7805507Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7805674Z [rank3]:E1204 12:48:14.628000 498339 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7805712Z dist init r=3, world=4 2025-12-04T12:52:45.7805862Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7806022Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7806314Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7806468Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7806762Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7806891Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7807167Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7807342Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7807619Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7807770Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7808050Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7808228Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7808509Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7808658Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7809177Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7809294Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7809489Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7809887Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7810001Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7810229Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7810393Z [rank2]:E1204 12:48:14.752000 498338 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7810434Z dist init r=2, world=4 2025-12-04T12:52:45.7810573Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7810735Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7811035Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7811189Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7811475Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7811621Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7811900Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7812048Z [rank1]:E1204 12:48:14.817000 498337 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7812325Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7812471Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7812746Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7812886Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7813163Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7813311Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7813825Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7813943Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7814141Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7814545Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7814659Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7814870Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7815037Z [rank1]:E1204 12:48:14.817000 498337 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7815075Z dist init r=1, world=4 2025-12-04T12:52:45.7815225Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7815383Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7815670Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7815837Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7816136Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7816261Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7816536Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7816685Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7816960Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7817110Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7817387Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7817524Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7817804Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7817952Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7818500Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
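Note: the RuntimeError above is raised by the PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 path, which compares per-device memory counters before and after the test body. A rough, illustrative sketch of that comparison; the helper name and the pass/fail condition are assumptions, not the harness's actual implementation:

    import torch

    def check_leak(device: int, run_test) -> None:
        # Snapshot caching-allocator and driver-level usage before the test.
        alloc_before = torch.cuda.memory_allocated(device)
        free, total = torch.cuda.mem_get_info(device)
        driver_before = total - free

        run_test()
        torch.cuda.synchronize(device)

        alloc_after = torch.cuda.memory_allocated(device)
        free, total = torch.cuda.mem_get_info(device)
        driver_after = total - free

        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
            )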
2025-12-04T12:52:45.7818613Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7818820Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7819219Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7819332Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7819557Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7819721Z [rank0]:E1204 12:48:14.905000 498336 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7819763Z dist init r=0, world=4 2025-12-04T12:52:45.7820100Z [rank0]:[W1204 12:48:15.893943547 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7820165Z FAILED [9.2145s] [ 12%] 2025-12-04T12:52:45.7820168Z 2025-12-04T12:52:45.7820227Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7820362Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7820411Z Traceback (most recent call last): 2025-12-04T12:52:45.7820574Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7820620Z self._join_processes(fn) 2025-12-04T12:52:45.7820792Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7820850Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7821030Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7821076Z raise RuntimeError(error) 2025-12-04T12:52:45.7821158Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7821207Z Traceback (most recent call last): 2025-12-04T12:52:45.7821368Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7821414Z getattr(self, test_name)() 2025-12-04T12:52:45.7821573Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7821612Z fn() 2025-12-04T12:52:45.7821763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7821806Z method(*args, **kwargs) 2025-12-04T12:52:45.7821958Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7822000Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7822150Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7822190Z with policy(): 2025-12-04T12:52:45.7822341Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7822386Z raise RuntimeError(msg) 2025-12-04T12:52:45.7822784Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 2025-12-04T12:52:45.7822787Z 2025-12-04T12:52:45.7822861Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7823133Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7823137Z 2025-12-04T12:52:45.7823225Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7823227Z 2025-12-04T12:52:45.7823229Z 2025-12-04T12:52:45.7823317Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7823405Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7823643Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-3614e4f517affde8.xml - 2025-12-04T12:52:45.7823704Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7823996Z FAILED [9.2145s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7824055Z Traceback (most recent call last): 2025-12-04T12:52:45.7824219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7824265Z getattr(self, test_name)() 2025-12-04T12:52:45.7824426Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7824465Z fn() 2025-12-04T12:52:45.7824617Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7824660Z method(*args, **kwargs) 2025-12-04T12:52:45.7824811Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7824854Z method(*args, **kwargs) 2025-12-04T12:52:45.7825006Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7825045Z with policy(): 2025-12-04T12:52:45.7825196Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7825240Z raise RuntimeError(msg) 2025-12-04T12:52:45.7825629Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! 
Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 2025-12-04T12:52:45.7825631Z 2025-12-04T12:52:45.7825708Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7825981Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7825984Z 2025-12-04T12:52:45.7826072Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7826137Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7826200Z ======================= 1 failed, 2 deselected in 9.22s ======================== 2025-12-04T12:52:45.7826240Z Got exit code 1 2025-12-04T12:52:45.7826280Z Retrying single test... 2025-12-04T12:52:45.7826470Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-49798aeb29a97079.xml 2025-12-04T12:52:45.7826540Z ============================= test session starts ============================== 2025-12-04T12:52:45.7826656Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7826699Z cachedir: .pytest_cache 2025-12-04T12:52:45.7826860Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7826906Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7826949Z configfile: pytest.ini 2025-12-04T12:52:45.7827112Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7827198Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7827461Z stepcurrent: skipping 2 already run items. 
Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7827509Z Running 1 items in this shard 2025-12-04T12:52:45.7827511Z 2025-12-04T12:52:45.7827856Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda I1204 12:48:19.254000 498669 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 498738 2025-12-04T12:52:45.7828034Z I1204 12:48:19.255000 498669 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 498739 2025-12-04T12:52:45.7828223Z I1204 12:48:19.256000 498669 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 498740 2025-12-04T12:52:45.7828375Z I1204 12:48:19.256000 498669 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 498741 2025-12-04T12:52:45.7828739Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7828788Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7829284Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7829349Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7829706Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7829754Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7830242Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7830306Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7830657Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7830706Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7831207Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7831269Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7831634Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7831679Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7832169Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7832249Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7832406Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7832568Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7832858Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7833015Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7833301Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7833429Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7833705Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7833857Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7834135Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7834288Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7834571Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7834712Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7834993Z [rank2]:E1204 12:48:26.617000 498740 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7835153Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7835672Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7835794Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7836002Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7836409Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7836524Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7836752Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7836930Z [rank2]:E1204 12:48:26.617000 498740 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7836975Z dist init r=2, world=4 2025-12-04T12:52:45.7837115Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7837279Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7837572Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7837728Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7838019Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7838173Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7838458Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7838607Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.7838889Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7839041Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7839316Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7839455Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7839748Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7839901Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7840428Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7840546Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7840746Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7841148Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7841291Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7841503Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7841670Z [rank3]:E1204 12:48:26.751000 498741 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7841708Z dist init r=3, world=4 2025-12-04T12:52:45.7841848Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7842009Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7842299Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7842455Z [rank0]:E1204 12:48:26.867000 498738 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7842738Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7842865Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7843144Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7843298Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7843575Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7843726Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7844016Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7844154Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7844436Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7844597Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7845112Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
2025-12-04T12:52:45.7845239Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7845444Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7845846Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7845960Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7846175Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7846340Z [rank0]:E1204 12:48:26.867000 498738 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7846383Z dist init r=0, world=4 2025-12-04T12:52:45.7846521Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7846683Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7846972Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7847126Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7847412Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7847538Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7847818Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7847967Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7848379Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7848527Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7848805Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7848943Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7849238Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7849391Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7849903Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7850045Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7850245Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7850647Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7850765Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7850977Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7851144Z [rank1]:E1204 12:48:26.903000 498739 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7851185Z dist init r=1, world=4 2025-12-04T12:52:45.7851527Z [rank0]:[W1204 12:48:27.803446166 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7851568Z FAILED [9.3150s] [100%] 2025-12-04T12:52:45.7851572Z 2025-12-04T12:52:45.7851629Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7851768Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7851815Z Traceback (most recent call last): 2025-12-04T12:52:45.7851980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7852025Z self._join_processes(fn) 2025-12-04T12:52:45.7852200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7852255Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7852446Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7852491Z raise RuntimeError(error) 2025-12-04T12:52:45.7852574Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7852621Z Traceback (most recent call last): 2025-12-04T12:52:45.7852787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7852830Z getattr(self, test_name)() 2025-12-04T12:52:45.7852991Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7853024Z fn() 2025-12-04T12:52:45.7853187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7853229Z method(*args, **kwargs) 2025-12-04T12:52:45.7853383Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7853424Z method(*args, **kwargs) 2025-12-04T12:52:45.7853575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7853622Z with policy(): 2025-12-04T12:52:45.7853787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7853827Z raise RuntimeError(msg) 2025-12-04T12:52:45.7854217Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
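Note: the ProcessGroupNCCL warning in the output above ("destroy_process_group() was not called before program exit") refers to the documented shutdown API. A minimal sketch of that teardown, assuming the group is created with torch.distributed.init_process_group and that MASTER_ADDR/MASTER_PORT are set in the environment; the try/finally placement is illustrative:

    import torch.distributed as dist

    def run_rank(rank: int, world_size: int) -> None:
        # Assumes MASTER_ADDR and MASTER_PORT are already set for rendezvous.
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            pass  # distributed work for this rank goes here
        finally:
            # Explicit teardown avoids the "destroy_process_group() was not called" warning.
            dist.destroy_process_group()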
2025-12-04T12:52:45.7854220Z 2025-12-04T12:52:45.7854296Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7854566Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7854569Z 2025-12-04T12:52:45.7854659Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7854663Z 2025-12-04T12:52:45.7854664Z 2025-12-04T12:52:45.7854740Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7854828Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7855064Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-49798aeb29a97079.xml - 2025-12-04T12:52:45.7855127Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7855412Z FAILED [9.3150s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7855458Z Traceback (most recent call last): 2025-12-04T12:52:45.7855622Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7855665Z getattr(self, test_name)() 2025-12-04T12:52:45.7855827Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7855862Z fn() 2025-12-04T12:52:45.7856017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7856056Z method(*args, **kwargs) 2025-12-04T12:52:45.7856208Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7856248Z method(*args, **kwargs) 2025-12-04T12:52:45.7856411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7856449Z with policy(): 2025-12-04T12:52:45.7856605Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7856647Z raise RuntimeError(msg) 2025-12-04T12:52:45.7857046Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7857048Z 2025-12-04T12:52:45.7857122Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7857397Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7857399Z 2025-12-04T12:52:45.7857490Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7857565Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.7857644Z ======================= 1 failed, 9 deselected in 9.33s ======================== 2025-12-04T12:52:45.7857683Z Got exit code 1 2025-12-04T12:52:45.7857728Z Retrying single test... 2025-12-04T12:52:45.7857917Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-5e212175c44e621e.xml 2025-12-04T12:52:45.7857978Z ============================= test session starts ============================== 2025-12-04T12:52:45.7858090Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7858134Z cachedir: .pytest_cache 2025-12-04T12:52:45.7858330Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7858379Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7858422Z configfile: pytest.ini 2025-12-04T12:52:45.7858589Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7858664Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7858930Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7858974Z Running 1 items in this shard 2025-12-04T12:52:45.7858979Z 2025-12-04T12:52:45.7859321Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda I1204 12:48:31.253000 499071 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 499140 2025-12-04T12:52:45.7859478Z I1204 12:48:31.254000 499071 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 499141 2025-12-04T12:52:45.7859633Z I1204 12:48:31.255000 499071 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 499142 2025-12-04T12:52:45.7859786Z I1204 12:48:31.255000 499071 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 499143 2025-12-04T12:52:45.7860148Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7860200Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7860707Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7860774Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7861143Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7861191Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7861681Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7861754Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7862124Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7862171Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7862659Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7862721Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7863071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7863124Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7863610Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7863673Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7863817Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7863983Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7864277Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7864434Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7864727Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7864873Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7865154Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7865304Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7865594Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7865742Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7866018Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7866166Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7866453Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7866606Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7867123Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
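[editor's note] The RuntimeError above is raised by the memory-leak check that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: allocator and driver memory are snapshotted before the test body runs and compared afterwards. A rough sketch of that before/after idea using only public torch.cuda APIs; the real check in common_utils.py is more involved, so this is illustrative only and check_for_leak is a made-up helper:

    # Illustrative before/after CUDA memory comparison, similar in spirit to
    # PYTORCH_TEST_CUDA_MEM_LEAK_CHECK (not the actual implementation).
    import torch

    def check_for_leak(fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        allocated_before = torch.cuda.memory_allocated(device)
        free_before, total = torch.cuda.mem_get_info(device)
        driver_before = total - free_before

        fn()  # run the test body

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        allocated_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        if allocated_after > allocated_before:
            raise RuntimeError(
                f"caching allocator grew: {allocated_before} -> {allocated_after} "
                f"(driver: {driver_before} -> {driver_after}) on device {device}"
            )

The numbers quoted in the log (512 -> 19456 on the caching allocator, ~2.3 GB -> ~3.1 GB on the driver) are exactly this kind of before/after pair, reported per rank.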
2025-12-04T12:52:45.7867241Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7867440Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7867842Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7867960Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7868221Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7868388Z [rank2]:E1204 12:48:38.688000 499142 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7868429Z dist init r=2, world=4 2025-12-04T12:52:45.7868569Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7868729Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7869019Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7869188Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7869475Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7869604Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7869892Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7870042Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7870318Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7870468Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7870776Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7870914Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7871192Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7871340Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7871855Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 2025-12-04T12:52:45.7871976Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7872172Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7872568Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7872682Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7872895Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7873059Z [rank0]:E1204 12:48:38.804000 499140 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7873105Z dist init r=0, world=4 2025-12-04T12:52:45.7873244Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7873416Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7873703Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7873859Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7874144Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7874276Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7874553Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7874699Z [rank1]:E1204 12:48:38.901000 499141 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7874993Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7875140Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7875414Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7875554Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7875831Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7875984Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7876497Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7876615Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7876813Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7877210Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7877328Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7877539Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7877715Z [rank1]:E1204 12:48:38.901000 499141 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7877752Z dist init r=1, world=4 2025-12-04T12:52:45.7877896Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7878059Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7878384Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7878552Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7878837Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7878960Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7879249Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7879417Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7879692Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7879841Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7880118Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7880258Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7880537Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7880687Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7881201Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
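[editor's note] The "Started process N with pid ..." and "dist init r=N, world=4" lines come from the multi-process test harness in common_distributed.py, which spawns one worker per rank and runs the test method inside each. A stripped-down sketch of that per-rank spawn pattern; _worker and the port choice are hypothetical, not the harness's own code:

    # Stripped-down per-rank spawn pattern, as a sketch of what the
    # common_distributed.py harness does at a high level.
    import os
    import torch
    import torch.distributed as dist
    import torch.multiprocessing as mp

    def _worker(rank: int, world_size: int) -> None:
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
        os.environ.setdefault("MASTER_PORT", "29500")
        torch.cuda.set_device(rank)
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            # ... run the test body on this rank ...
            dist.barrier()
        finally:
            # Explicit teardown also avoids the ProcessGroupNCCL warning about
            # destroy_process_group() not being called before exit.
            dist.destroy_process_group()

    if __name__ == "__main__":
        world_size = 4
        mp.spawn(_worker, args=(world_size,), nprocs=world_size, join=True)

The parent process joins the four workers and turns any non-zero per-rank exit code (here, 10) into the "Process N exited with error code 10" failure shown further down.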
2025-12-04T12:52:45.7881322Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7881517Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7881914Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7882040Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7882251Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7882415Z [rank3]:E1204 12:48:38.922000 499143 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7882456Z dist init r=3, world=4 2025-12-04T12:52:45.7882803Z [rank0]:[W1204 12:48:39.732262120 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7882845Z FAILED [9.3159s] [100%] 2025-12-04T12:52:45.7882847Z 2025-12-04T12:52:45.7882903Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7883036Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.7883082Z Traceback (most recent call last): 2025-12-04T12:52:45.7883244Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7883308Z self._join_processes(fn) 2025-12-04T12:52:45.7883481Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7883538Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7883718Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7883765Z raise RuntimeError(error) 2025-12-04T12:52:45.7883847Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7883896Z Traceback (most recent call last): 2025-12-04T12:52:45.7884060Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7884105Z getattr(self, test_name)() 2025-12-04T12:52:45.7884265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7884302Z fn() 2025-12-04T12:52:45.7884453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7884498Z method(*args, **kwargs) 2025-12-04T12:52:45.7884650Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7884694Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7884847Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7884888Z with policy(): 2025-12-04T12:52:45.7885042Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7885086Z raise RuntimeError(msg) 2025-12-04T12:52:45.7885478Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7885482Z 2025-12-04T12:52:45.7885557Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7885832Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7885834Z 2025-12-04T12:52:45.7885923Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7885937Z 2025-12-04T12:52:45.7885939Z 2025-12-04T12:52:45.7886019Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7886108Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7886343Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-5e212175c44e621e.xml - 2025-12-04T12:52:45.7886403Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7886698Z FAILED [9.3159s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7886749Z Traceback (most recent call last): 2025-12-04T12:52:45.7886917Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7886963Z getattr(self, test_name)() 2025-12-04T12:52:45.7887123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7887179Z fn() 2025-12-04T12:52:45.7887331Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7887375Z method(*args, **kwargs) 2025-12-04T12:52:45.7887526Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7887571Z method(*args, **kwargs) 2025-12-04T12:52:45.7887723Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7887764Z with policy(): 2025-12-04T12:52:45.7887918Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7887961Z raise RuntimeError(msg) 2025-12-04T12:52:45.7888399Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda! 
Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7888403Z 2025-12-04T12:52:45.7888482Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7888757Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7888759Z 2025-12-04T12:52:45.7888847Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7888915Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7888978Z ======================= 1 failed, 9 deselected in 9.33s ======================== 2025-12-04T12:52:45.7889018Z Got exit code 1 2025-12-04T12:52:45.7889238Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.7889371Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.7889560Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-ee7066dd84237162.xml 2025-12-04T12:52:45.7889622Z ============================= test session starts ============================== 2025-12-04T12:52:45.7889733Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7889778Z cachedir: .pytest_cache 2025-12-04T12:52:45.7889955Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7890004Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7890045Z configfile: pytest.ini 2025-12-04T12:52:45.7890210Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7890284Z collecting ... collected 10 items / 3 deselected / 7 selected 2025-12-04T12:52:45.7890339Z stepcurrent: skipping 3 already run items. 
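[editor's note] The sequence above (exit code 1, "Retrying single test...", then "FAILED CONSISTENTLY" and "continuing with the rest of the tests due to continue-through-error being set") reflects the runner's policy of re-running a failed test in isolation and only then recording it as a consistent failure while the rest of the shard proceeds. A hedged sketch of that control flow; the real logic lives in the test runner, and run_single_test and run_shard here are made-up placeholders:

    # Hypothetical sketch of the retry-then-continue policy seen in the log.
    import subprocess
    import sys
    from typing import Sequence

    def run_single_test(test_id: str) -> int:
        # Placeholder: invoke pytest on exactly one test and return its exit code.
        return subprocess.call([sys.executable, "-m", "pytest", test_id, "-x"])

    def run_shard(tests: Sequence[str], continue_through_error: bool = True) -> list[str]:
        consistent_failures = []
        for test_id in tests:
            if run_single_test(test_id) == 0:
                continue
            # First failure: retry the single test in isolation.
            if run_single_test(test_id) != 0:
                consistent_failures.append(test_id)
                if not continue_through_error:
                    break
        return consistent_failures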
2025-12-04T12:52:45.7890384Z Running 7 items in this shard 2025-12-04T12:52:45.7890388Z 2025-12-04T12:52:45.7890745Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda I1204 12:48:43.271000 499473 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 499542 2025-12-04T12:52:45.7890905Z I1204 12:48:43.272000 499473 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 499543 2025-12-04T12:52:45.7891057Z I1204 12:48:43.273000 499473 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 499544 2025-12-04T12:52:45.7891225Z I1204 12:48:43.273000 499473 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 499545 2025-12-04T12:52:45.7891597Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7891649Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7892143Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7896293Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7896660Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7896709Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7897206Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7897267Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7897622Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7897670Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7898195Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7898256Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7898638Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7898687Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7899190Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7899249Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7899393Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7899558Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7899851Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7900044Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7900335Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7900459Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7900741Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7900890Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7901171Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7901320Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7901595Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7901733Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7902010Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in 
__exit__ 2025-12-04T12:52:45.7902160Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7902678Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7902804Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7903000Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7903401Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7903526Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7903740Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7903906Z [rank3]:E1204 12:48:50.529000 499545 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7903945Z dist init r=3, world=4 2025-12-04T12:52:45.7904093Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7904264Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7904551Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7904704Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7904987Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7905112Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7905389Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7905536Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7905812Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7905961Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7906237Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7906373Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7906651Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7906801Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7907323Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7907439Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7907635Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7908041Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7908194Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7908407Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7908596Z [rank1]:E1204 12:48:50.669000 499543 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7908635Z dist init r=1, world=4 2025-12-04T12:52:45.7908771Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7908933Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7909219Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7909373Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7909659Z [rank0]:E1204 12:48:50.679000 499542 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7909783Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7910059Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7910207Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7910481Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7910628Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7910904Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7911039Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7911324Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7911473Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7912000Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2462056448 and is now 3307208704. 
2025-12-04T12:52:45.7912115Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7912312Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7912707Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7912840Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7913051Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7913216Z [rank0]:E1204 12:48:50.679000 499542 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7913253Z dist init r=0, world=4 2025-12-04T12:52:45.7913391Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7913550Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7913836Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7913992Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7914274Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7914399Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7914673Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7914823Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7915099Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7915247Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7915532Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7915668Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7915945Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7916093Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7916619Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7916733Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7916940Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7917350Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7917463Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7917673Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7917836Z [rank2]:E1204 12:48:50.721000 499544 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7917875Z dist init r=2, world=4 2025-12-04T12:52:45.7918247Z [rank0]:[W1204 12:48:50.527016978 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7918288Z FAILED [9.1147s] [ 14%] 2025-12-04T12:52:45.7918290Z 2025-12-04T12:52:45.7918349Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7918483Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7918530Z Traceback (most recent call last): 2025-12-04T12:52:45.7918694Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7918738Z self._join_processes(fn) 2025-12-04T12:52:45.7918911Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7918967Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7919144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7919188Z raise RuntimeError(error) 2025-12-04T12:52:45.7919271Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7919316Z Traceback (most recent call last): 2025-12-04T12:52:45.7919478Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7919521Z getattr(self, test_name)() 2025-12-04T12:52:45.7919698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7919734Z fn() 2025-12-04T12:52:45.7919885Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7919928Z method(*args, **kwargs) 2025-12-04T12:52:45.7920077Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7920118Z method(*args, **kwargs) 2025-12-04T12:52:45.7920281Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7920321Z with policy(): 2025-12-04T12:52:45.7920472Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7920514Z raise RuntimeError(msg) 2025-12-04T12:52:45.7920903Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 
2025-12-04T12:52:45.7920935Z 2025-12-04T12:52:45.7921009Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7921280Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7921284Z 2025-12-04T12:52:45.7921372Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7921374Z 2025-12-04T12:52:45.7921376Z 2025-12-04T12:52:45.7921453Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7921542Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7921778Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-ee7066dd84237162.xml - 2025-12-04T12:52:45.7921841Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7922128Z FAILED [9.1147s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7922175Z Traceback (most recent call last): 2025-12-04T12:52:45.7922340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7922383Z getattr(self, test_name)() 2025-12-04T12:52:45.7922542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7922579Z fn() 2025-12-04T12:52:45.7922730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7922775Z method(*args, **kwargs) 2025-12-04T12:52:45.7922925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7922965Z method(*args, **kwargs) 2025-12-04T12:52:45.7923115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7923153Z with policy(): 2025-12-04T12:52:45.7923304Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7923346Z raise RuntimeError(msg) 2025-12-04T12:52:45.7924688Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3097493504. 2025-12-04T12:52:45.7924693Z 2025-12-04T12:52:45.7924768Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7925037Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7925040Z 2025-12-04T12:52:45.7925137Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7925202Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
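[editor's note] The repro line printed above can be run as-is from the repository root; the sketch below simply wraps that same command in Python so the two environment flags are set explicitly. The command and test name are copied verbatim from the log; nothing else is assumed:

    # Reproduce the strategy1 failure locally with the flags the log prints.
    import os
    import subprocess

    env = dict(os.environ)
    env["PYTORCH_TEST_WITH_ROCM"] = "1"
    env["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"

    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_comm.py",
            "TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda",
        ],
        env=env,
        check=False,  # inspect the return code and output instead of raising
    )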
2025-12-04T12:52:45.7925264Z ======================= 1 failed, 3 deselected in 9.12s ======================== 2025-12-04T12:52:45.7925302Z Got exit code 1 2025-12-04T12:52:45.7925343Z Retrying single test... 2025-12-04T12:52:45.7925531Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-036f49a76ee38524.xml 2025-12-04T12:52:45.7925602Z ============================= test session starts ============================== 2025-12-04T12:52:45.7925725Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7925766Z cachedir: .pytest_cache 2025-12-04T12:52:45.7925923Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7925970Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7926010Z configfile: pytest.ini 2025-12-04T12:52:45.7926173Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7926248Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7926509Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7926555Z Running 1 items in this shard 2025-12-04T12:52:45.7926557Z 2025-12-04T12:52:45.7926900Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda I1204 12:48:55.120000 499875 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 499944 2025-12-04T12:52:45.7927057Z I1204 12:48:55.122000 499875 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 499945 2025-12-04T12:52:45.7927208Z I1204 12:48:55.122000 499875 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 499946 2025-12-04T12:52:45.7927358Z I1204 12:48:55.123000 499875 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 499947 2025-12-04T12:52:45.7927717Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7927767Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7928121Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7928197Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7928706Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7928772Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7929272Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7929333Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7929687Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7929733Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7930230Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7930306Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7930658Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7930704Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7931192Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7931252Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7931396Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7931558Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7931850Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7932004Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7932290Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7932414Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7932690Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7932853Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7933129Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7933278Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7933555Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7933702Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7933980Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7934127Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7934660Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7934776Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7934972Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7935370Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7935485Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7935696Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7935861Z [rank1]:E1204 12:49:02.497000 499945 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7935900Z dist init r=1, world=4 2025-12-04T12:52:45.7936038Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7936197Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7936483Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7936636Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7936920Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7937044Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7937328Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7937478Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7937753Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7937912Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7938223Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7938359Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7938647Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7938811Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7939327Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 2025-12-04T12:52:45.7939443Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7939639Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7940037Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7940150Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7940361Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7940524Z [rank3]:E1204 12:49:02.532000 499947 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7940563Z dist init r=3, world=4 2025-12-04T12:52:45.7940702Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7940860Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7941147Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7941300Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7941597Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7941724Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7941999Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7942158Z [rank2]:E1204 12:49:02.678000 499946 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7942433Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7942580Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7942863Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7943009Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7943286Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7943433Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7943948Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7944063Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7944261Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7944659Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7944772Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7944983Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7945145Z [rank2]:E1204 12:49:02.678000 499946 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7945184Z dist init r=2, world=4 2025-12-04T12:52:45.7945322Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7945481Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7945776Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7945932Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7946219Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7946353Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7946632Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7946778Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7947054Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7947225Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7947501Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7947636Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7947912Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7948062Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7948618Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
2025-12-04T12:52:45.7948733Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7948928Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7949324Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7949439Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7949650Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7949813Z [rank0]:E1204 12:49:02.799000 499944 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7949851Z dist init r=0, world=4 2025-12-04T12:52:45.7950205Z [rank0]:[W1204 12:49:03.807760409 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7950246Z FAILED [9.2132s] [100%] 2025-12-04T12:52:45.7950248Z 2025-12-04T12:52:45.7950305Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7950440Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7950486Z Traceback (most recent call last): 2025-12-04T12:52:45.7950663Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7950706Z self._join_processes(fn) 2025-12-04T12:52:45.7950881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7950935Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7951114Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7951183Z raise RuntimeError(error) 2025-12-04T12:52:45.7951265Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7951311Z Traceback (most recent call last): 2025-12-04T12:52:45.7951472Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7951514Z getattr(self, test_name)() 2025-12-04T12:52:45.7951673Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7951707Z fn() 2025-12-04T12:52:45.7951860Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7951900Z method(*args, **kwargs) 2025-12-04T12:52:45.7952051Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7952094Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7952246Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7952282Z with policy(): 2025-12-04T12:52:45.7952435Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7952477Z raise RuntimeError(msg) 2025-12-04T12:52:45.7952869Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7952871Z 2025-12-04T12:52:45.7952947Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7953220Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7953223Z 2025-12-04T12:52:45.7953312Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7953314Z 2025-12-04T12:52:45.7953373Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7953420Z Traceback (most recent call last): 2025-12-04T12:52:45.7953582Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7953625Z getattr(self, test_name)() 2025-12-04T12:52:45.7953795Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7953830Z fn() 2025-12-04T12:52:45.7953980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7954022Z method(*args, **kwargs) 2025-12-04T12:52:45.7954171Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7954211Z method(*args, **kwargs) 2025-12-04T12:52:45.7954362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7954409Z with policy(): 2025-12-04T12:52:45.7954559Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7954601Z raise RuntimeError(msg) 2025-12-04T12:52:45.7954989Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 
2025-12-04T12:52:45.7955010Z 2025-12-04T12:52:45.7955084Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7955353Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7955355Z 2025-12-04T12:52:45.7955443Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7955445Z 2025-12-04T12:52:45.7955447Z 2025-12-04T12:52:45.7955523Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7955612Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7955847Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-036f49a76ee38524.xml - 2025-12-04T12:52:45.7955909Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7956192Z FAILED [9.2132s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7956240Z Traceback (most recent call last): 2025-12-04T12:52:45.7956405Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7956450Z getattr(self, test_name)() 2025-12-04T12:52:45.7956608Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7956645Z fn() 2025-12-04T12:52:45.7956797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7956840Z method(*args, **kwargs) 2025-12-04T12:52:45.7956991Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7957034Z method(*args, **kwargs) 2025-12-04T12:52:45.7957183Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7957221Z with policy(): 2025-12-04T12:52:45.7957372Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7957414Z raise RuntimeError(msg) 2025-12-04T12:52:45.7957813Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7957817Z 2025-12-04T12:52:45.7957890Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7958211Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7958213Z 2025-12-04T12:52:45.7958302Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7958304Z 2025-12-04T12:52:45.7958380Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.7958426Z Traceback (most recent call last): 2025-12-04T12:52:45.7958587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7958630Z getattr(self, test_name)() 2025-12-04T12:52:45.7958790Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7958837Z fn() 2025-12-04T12:52:45.7959002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7959043Z method(*args, **kwargs) 2025-12-04T12:52:45.7959196Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7959236Z method(*args, **kwargs) 2025-12-04T12:52:45.7959387Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7959423Z with policy(): 2025-12-04T12:52:45.7959578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7959619Z raise RuntimeError(msg) 2025-12-04T12:52:45.7960006Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 2025-12-04T12:52:45.7960010Z 2025-12-04T12:52:45.7960084Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7960355Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7960357Z 2025-12-04T12:52:45.7960445Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7960509Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.7960576Z ======================= 1 failed, 9 deselected in 9.22s ======================== 2025-12-04T12:52:45.7960614Z Got exit code 1 2025-12-04T12:52:45.7960655Z Retrying single test... 
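The per-rank UserWarnings above ("FSDP got the argument `device_id` cuda ..., which does not have an explicit index") also state their own remedy: either set the current device before constructing FSDP or pass an indexed device. A minimal sketch of both options; `model` is a placeholder and the snippet assumes the process group has already been initialized with one GPU per rank (e.g. under torchrun):

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    rank = dist.get_rank()
    model = torch.nn.Linear(8, 8)  # placeholder for the real module

    # Option 1: pin the current device first, so an index-less device_id resolves
    # to the intended GPU instead of triggering the warning.
    torch.cuda.set_device(rank)
    fsdp_model = FSDP(model, device_id=torch.device("cuda"))

    # Option 2: pass the explicit index so FSDP does not have to guess.
    fsdp_model = FSDP(model, device_id=torch.device("cuda", rank))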
2025-12-04T12:52:45.7960845Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-e156b6b4e48e5ac7.xml 2025-12-04T12:52:45.7960905Z ============================= test session starts ============================== 2025-12-04T12:52:45.7961016Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7961059Z cachedir: .pytest_cache 2025-12-04T12:52:45.7961216Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7961263Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7961303Z configfile: pytest.ini 2025-12-04T12:52:45.7961488Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7961564Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.7961828Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7961876Z Running 1 items in this shard 2025-12-04T12:52:45.7961878Z 2025-12-04T12:52:45.7962228Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda I1204 12:49:06.982000 500277 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 500346 2025-12-04T12:52:45.7962383Z I1204 12:49:06.983000 500277 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 500347 2025-12-04T12:52:45.7962536Z I1204 12:49:06.983000 500277 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 500348 2025-12-04T12:52:45.7962688Z I1204 12:49:06.984000 500277 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 500349 2025-12-04T12:52:45.7963063Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7963122Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7963617Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7963680Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7964036Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7964085Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7964437Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7964483Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7964973Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7965034Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7965520Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7965581Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7965943Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T12:52:45.7965990Z self.encoder = TransformerEncoder( 2025-12-04T12:52:45.7966477Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7966536Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7966688Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7966852Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7967145Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7967315Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7967612Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7967738Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7968015Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7968212Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7968489Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7968639Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7968914Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7969054Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7969334Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7969487Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7970005Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2243952640 and is now 3097493504. 
2025-12-04T12:52:45.7970121Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7970333Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7970732Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7970849Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7971075Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7971240Z [rank3]:E1204 12:49:14.415000 500349 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.7971281Z dist init r=3, world=4 2025-12-04T12:52:45.7971420Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7971592Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7971892Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7972046Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7972329Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7972455Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7972731Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7972880Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7973155Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7973300Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7973575Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7973711Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.7973991Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7974141Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7974666Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7974783Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7974978Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7975388Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7975502Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7975715Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7975890Z [rank2]:E1204 12:49:14.445000 500348 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.7975939Z dist init r=2, world=4 2025-12-04T12:52:45.7976079Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7976238Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7976529Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7976682Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7976967Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7977092Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7977369Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7977517Z [rank1]:E1204 12:49:14.481000 500347 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7977793Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7977939Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7978249Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7978385Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7978662Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7978827Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7979339Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7979455Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7979670Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7980066Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7980192Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7980416Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7980581Z [rank1]:E1204 12:49:14.481000 500347 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.7980622Z dist init r=1, world=4 2025-12-04T12:52:45.7980760Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.7980922Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.7981208Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T12:52:45.7981367Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.7981651Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7981775Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.7982049Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7982199Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7982474Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7982620Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.7982895Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7983045Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.7983323Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7983475Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.7983996Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2459959296 and is now 3307208704. 
2025-12-04T12:52:45.7984113Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7984307Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7984724Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7984839Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.7985050Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7985216Z [rank0]:E1204 12:49:14.660000 500346 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.7985253Z dist init r=0, world=4 2025-12-04T12:52:45.7985589Z [rank0]:[W1204 12:49:14.562640898 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.7985629Z FAILED [9.5161s] [100%] 2025-12-04T12:52:45.7985631Z 2025-12-04T12:52:45.7985688Z =================================== FAILURES =================================== 2025-12-04T12:52:45.7985824Z _ TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.7985870Z Traceback (most recent call last): 2025-12-04T12:52:45.7986032Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.7986077Z self._join_processes(fn) 2025-12-04T12:52:45.7986250Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.7986308Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.7986486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.7986531Z raise RuntimeError(error) 2025-12-04T12:52:45.7986611Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7986658Z Traceback (most recent call last): 2025-12-04T12:52:45.7986818Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7986864Z getattr(self, test_name)() 2025-12-04T12:52:45.7987022Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7987067Z fn() 2025-12-04T12:52:45.7987219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7987262Z method(*args, **kwargs) 2025-12-04T12:52:45.7987417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7987457Z method(*args, 
**kwargs) 2025-12-04T12:52:45.7987606Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7987643Z with policy(): 2025-12-04T12:52:45.7987808Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7987850Z raise RuntimeError(msg) 2025-12-04T12:52:45.7988273Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 2025-12-04T12:52:45.7988290Z 2025-12-04T12:52:45.7988364Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7988648Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7988650Z 2025-12-04T12:52:45.7988738Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7988742Z 2025-12-04T12:52:45.7988800Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7988846Z Traceback (most recent call last): 2025-12-04T12:52:45.7989009Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7989052Z getattr(self, test_name)() 2025-12-04T12:52:45.7989209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7989246Z fn() 2025-12-04T12:52:45.7989397Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7989439Z method(*args, **kwargs) 2025-12-04T12:52:45.7989588Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7989627Z method(*args, **kwargs) 2025-12-04T12:52:45.7989777Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7989817Z with policy(): 2025-12-04T12:52:45.7989968Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7990008Z raise RuntimeError(msg) 2025-12-04T12:52:45.7990396Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 
2025-12-04T12:52:45.7990400Z 2025-12-04T12:52:45.7990474Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7990743Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7990746Z 2025-12-04T12:52:45.7990833Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7990835Z 2025-12-04T12:52:45.7990837Z 2025-12-04T12:52:45.7990933Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.7991021Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.7991256Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-e156b6b4e48e5ac7.xml - 2025-12-04T12:52:45.7991316Z =========================== short test summary info ============================ 2025-12-04T12:52:45.7991599Z FAILED [9.5161s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.7991657Z Traceback (most recent call last): 2025-12-04T12:52:45.7991821Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7991863Z getattr(self, test_name)() 2025-12-04T12:52:45.7992025Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7992058Z fn() 2025-12-04T12:52:45.7992221Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7992270Z method(*args, **kwargs) 2025-12-04T12:52:45.7992421Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7992460Z method(*args, **kwargs) 2025-12-04T12:52:45.7992612Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7992649Z with policy(): 2025-12-04T12:52:45.7992801Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7992842Z raise RuntimeError(msg) 2025-12-04T12:52:45.7993228Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3164602368. 
2025-12-04T12:52:45.7993232Z 2025-12-04T12:52:45.7993305Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7993575Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7993577Z 2025-12-04T12:52:45.7993666Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7993668Z 2025-12-04T12:52:45.7993725Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.7993770Z Traceback (most recent call last): 2025-12-04T12:52:45.7993933Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.7993975Z getattr(self, test_name)() 2025-12-04T12:52:45.7994135Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.7994169Z fn() 2025-12-04T12:52:45.7994320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7994359Z method(*args, **kwargs) 2025-12-04T12:52:45.7994511Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.7994551Z method(*args, **kwargs) 2025-12-04T12:52:45.7994701Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.7994736Z with policy(): 2025-12-04T12:52:45.7994909Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.7994952Z raise RuntimeError(msg) 2025-12-04T12:52:45.7995336Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3147825152. 2025-12-04T12:52:45.7995339Z 2025-12-04T12:52:45.7995411Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.7995701Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7995703Z 2025-12-04T12:52:45.7995792Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.7995857Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
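A minimal sketch (not part of this log) of the kind of before/after accounting that the mem-leak check above reports, assuming a single visible CUDA/ROCm device; the names snapshot and the workload placeholder are illustrative, and the real harness is the one enabled by PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 inside torch.testing._internal.common_utils, not this code.

    import torch

    def snapshot(device: int = 0):
        # Compare the caching-allocator view with the driver-level view, as the
        # "Caching allocator allocated memory was ... CUDA driver allocated memory
        # was ..." message above does.
        torch.cuda.synchronize(device)
        alloc = torch.cuda.memory_allocated(device)      # bytes held by the caching allocator
        free, total = torch.cuda.mem_get_info(device)    # driver-level (free, total)
        return alloc, total - free                       # (allocator bytes, driver-allocated bytes)

    before = snapshot(0)
    # ... run the suspect workload here (placeholder) ...
    after = snapshot(0)
    if after[0] > before[0] or after[1] > before[1]:
        print(f"possible leak: allocator {before[0]} -> {after[0]}, "
              f"driver {before[1]} -> {after[1]}")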
2025-12-04T12:52:45.7995921Z ======================= 1 failed, 9 deselected in 9.53s ======================== 2025-12-04T12:52:45.7995979Z Got exit code 1 2025-12-04T12:52:45.7996200Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.7996328Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.7996520Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-b0fce4bab8e79ff7.xml 2025-12-04T12:52:45.7996577Z ============================= test session starts ============================== 2025-12-04T12:52:45.7996691Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.7996735Z cachedir: .pytest_cache 2025-12-04T12:52:45.7996892Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.7996938Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.7996982Z configfile: pytest.ini 2025-12-04T12:52:45.7997143Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.7997218Z collecting ... collected 10 items / 4 deselected / 6 selected 2025-12-04T12:52:45.7997270Z stepcurrent: skipping 4 already run items. 2025-12-04T12:52:45.7997316Z Running 6 items in this shard 2025-12-04T12:52:45.7997318Z 2025-12-04T12:52:45.7997662Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda I1204 12:49:19.064000 500679 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 500748 2025-12-04T12:52:45.7997818Z I1204 12:49:19.066000 500679 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 500749 2025-12-04T12:52:45.7997971Z I1204 12:49:19.066000 500679 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 500750 2025-12-04T12:52:45.7998121Z I1204 12:49:19.067000 500679 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 500751 2025-12-04T12:52:45.7998692Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7998755Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7999255Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.7999317Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.7999811Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.7999870Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8000358Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8000446Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8000590Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8000756Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8001048Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8001203Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8001492Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8001619Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8001900Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8002049Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8002326Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8002476Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8002753Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8002891Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8003185Z [rank1]:E1204 12:49:26.057000 500749 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8003335Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8003850Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8003976Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8004172Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8004571Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8004710Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8004921Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8005089Z [rank1]:E1204 12:49:26.057000 500749 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8005128Z dist init r=1, world=4 2025-12-04T12:52:45.8005267Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8005426Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8005715Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8005871Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8006156Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8006279Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8006555Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8006706Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.8006981Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8007131Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8007427Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8007563Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8007841Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8007990Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8008552Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8008667Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8008861Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8009288Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8009402Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8009616Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8009782Z [rank3]:E1204 12:49:26.136000 500751 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8009822Z dist init r=3, world=4 2025-12-04T12:52:45.8009960Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8010120Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8010406Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8010560Z [rank2]:E1204 12:49:26.201000 500750 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8010844Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8010969Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8011244Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8011391Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8011667Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8011826Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8012102Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8012240Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8012525Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8012677Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8013188Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 
2025-12-04T12:52:45.8013323Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8013520Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8013918Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8014034Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8014246Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8014412Z [rank2]:E1204 12:49:26.201000 500750 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8014451Z dist init r=2, world=4 2025-12-04T12:52:45.8014589Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8014749Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8015035Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8015191Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8015475Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8015599Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8015878Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8016040Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8016313Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8016463Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8016750Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8016886Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8017163Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8017310Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8017844Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8017959Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8018194Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8018593Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8018708Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8018921Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8019084Z [rank0]:E1204 12:49:26.226000 500748 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8019123Z dist init r=0, world=4 2025-12-04T12:52:45.8019459Z [rank0]:[W1204 12:49:26.147153513 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8019502Z FAILED [8.9164s] [ 16%] 2025-12-04T12:52:45.8019505Z 2025-12-04T12:52:45.8019562Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8019696Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8019743Z Traceback (most recent call last): 2025-12-04T12:52:45.8019906Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8019950Z self._join_processes(fn) 2025-12-04T12:52:45.8020123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8020194Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8020372Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8020417Z raise RuntimeError(error) 2025-12-04T12:52:45.8020498Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8020544Z Traceback (most recent call last): 2025-12-04T12:52:45.8020704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8020747Z getattr(self, test_name)() 2025-12-04T12:52:45.8020922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8020960Z fn() 2025-12-04T12:52:45.8021111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8021153Z method(*args, **kwargs) 2025-12-04T12:52:45.8021302Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8021356Z method(*args, **kwargs) 2025-12-04T12:52:45.8021517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8021554Z with policy(): 2025-12-04T12:52:45.8021704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8021746Z raise RuntimeError(msg) 2025-12-04T12:52:45.8022133Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8022138Z 2025-12-04T12:52:45.8022214Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8022485Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8022489Z 2025-12-04T12:52:45.8022578Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8022580Z 2025-12-04T12:52:45.8022582Z 2025-12-04T12:52:45.8022657Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8022745Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8022982Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-b0fce4bab8e79ff7.xml - 2025-12-04T12:52:45.8023043Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8023326Z FAILED [8.9164s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8023373Z Traceback (most recent call last): 2025-12-04T12:52:45.8023536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8023580Z getattr(self, test_name)() 2025-12-04T12:52:45.8023739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8023775Z fn() 2025-12-04T12:52:45.8023925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8023966Z method(*args, **kwargs) 2025-12-04T12:52:45.8024132Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8024175Z method(*args, **kwargs) 2025-12-04T12:52:45.8024325Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8024365Z with policy(): 2025-12-04T12:52:45.8024515Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8024557Z raise RuntimeError(msg) 2025-12-04T12:52:45.8024953Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8024956Z 2025-12-04T12:52:45.8025032Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8025305Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8025334Z 2025-12-04T12:52:45.8025422Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8025486Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
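A minimal sketch, under assumptions, of the remedy suggested by the FSDP UserWarning repeated above ("please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument"); wrap_with_fsdp and rank are illustrative names, and the sketch assumes the process group is already initialized on each rank.

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_fsdp(model, rank):
        # Assumes torch.distributed.init_process_group(...) has already run on this rank.
        torch.cuda.set_device(rank)                                 # make the current device explicit
        return FSDP(model, device_id=torch.device("cuda", rank))    # indexed device_id avoids the warning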
2025-12-04T12:52:45.8025549Z ======================= 1 failed, 4 deselected in 8.93s ======================== 2025-12-04T12:52:45.8025586Z Got exit code 1 2025-12-04T12:52:45.8025628Z Retrying single test... 2025-12-04T12:52:45.8025820Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-57e00d741dcc51b9.xml 2025-12-04T12:52:45.8025878Z ============================= test session starts ============================== 2025-12-04T12:52:45.8025991Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8026031Z cachedir: .pytest_cache 2025-12-04T12:52:45.8026191Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8026237Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8026277Z configfile: pytest.ini 2025-12-04T12:52:45.8026438Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8026512Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8026775Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8026820Z Running 1 items in this shard 2025-12-04T12:52:45.8026822Z 2025-12-04T12:52:45.8027168Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda I1204 12:49:30.559000 501081 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 501150 2025-12-04T12:52:45.8027326Z I1204 12:49:30.560000 501081 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 501151 2025-12-04T12:52:45.8027479Z I1204 12:49:30.561000 501081 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 501152 2025-12-04T12:52:45.8027629Z I1204 12:49:30.562000 501081 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 501153 2025-12-04T12:52:45.8028136Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8028247Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8028733Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8028813Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8029297Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. 
FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8029369Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8029855Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8029928Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8030073Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8030237Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8030526Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8030681Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8030966Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8031089Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8031368Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8031516Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8031795Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8031943Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8032216Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8032365Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8032642Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8032792Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 
2025-12-04T12:52:45.8033316Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8033434Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8033630Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8034045Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8034171Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8034382Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8034546Z [rank2]:E1204 12:49:37.663000 501152 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8034585Z dist init r=2, world=4 2025-12-04T12:52:45.8034725Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8034885Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8035173Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8035327Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8035612Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8035739Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8036015Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8036164Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8036441Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8036602Z [rank3]:E1204 12:49:37.665000 501153 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8036878Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8037015Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8037294Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8037451Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8037964Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8038098Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8038324Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8038724Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8038839Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8039255Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8039420Z [rank3]:E1204 12:49:37.665000 501153 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8039459Z dist init r=3, world=4 2025-12-04T12:52:45.8039596Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8039757Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8040043Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8040197Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8040482Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in 
wrapper 2025-12-04T12:52:45.8040606Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8040885Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8041035Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8041326Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8041476Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8041750Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8041908Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8042185Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8042333Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8042859Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8042989Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8043184Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8043585Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8043702Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8043911Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8044077Z [rank1]:E1204 12:49:37.796000 501151 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8044116Z dist init r=1, world=4 2025-12-04T12:52:45.8044256Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8044414Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8044699Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8044855Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8045139Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8045263Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8045549Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8045700Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8045977Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8046132Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8046415Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8046551Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8046829Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8046997Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8047509Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8047624Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8047820Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8048269Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8048384Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8048594Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8048757Z [rank0]:E1204 12:49:37.892000 501150 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8048797Z dist init r=0, world=4 2025-12-04T12:52:45.8049137Z [rank0]:[W1204 12:49:38.838474802 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8049179Z FAILED [8.9135s] [100%] 2025-12-04T12:52:45.8049181Z 2025-12-04T12:52:45.8049237Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8049371Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8049419Z Traceback (most recent call last): 2025-12-04T12:52:45.8049595Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8049641Z self._join_processes(fn) 2025-12-04T12:52:45.8049813Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8049870Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8050047Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8050091Z raise RuntimeError(error) 2025-12-04T12:52:45.8050170Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8050216Z Traceback (most recent call last): 2025-12-04T12:52:45.8050387Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8050432Z getattr(self, test_name)() 2025-12-04T12:52:45.8050590Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8050626Z fn() 2025-12-04T12:52:45.8050778Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8050845Z method(*args, **kwargs) 2025-12-04T12:52:45.8050995Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8051036Z method(*args, **kwargs) 2025-12-04T12:52:45.8051185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8051223Z with policy(): 2025-12-04T12:52:45.8051375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8051416Z raise RuntimeError(msg) 2025-12-04T12:52:45.8051805Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 
2025-12-04T12:52:45.8051809Z 2025-12-04T12:52:45.8051884Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8052155Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8052157Z 2025-12-04T12:52:45.8052245Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8052248Z 2025-12-04T12:52:45.8052249Z 2025-12-04T12:52:45.8052325Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8052413Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8052647Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-57e00d741dcc51b9.xml - 2025-12-04T12:52:45.8052710Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8052993Z FAILED [8.9135s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8053042Z Traceback (most recent call last): 2025-12-04T12:52:45.8053206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8053251Z getattr(self, test_name)() 2025-12-04T12:52:45.8053409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8053455Z fn() 2025-12-04T12:52:45.8053606Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8053649Z method(*args, **kwargs) 2025-12-04T12:52:45.8053800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8053842Z method(*args, **kwargs) 2025-12-04T12:52:45.8053992Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8054029Z with policy(): 2025-12-04T12:52:45.8054189Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8054232Z raise RuntimeError(msg) 2025-12-04T12:52:45.8054619Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8054632Z 2025-12-04T12:52:45.8054705Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8054987Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8054989Z 2025-12-04T12:52:45.8055076Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8055142Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
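The failures above all come from the test harness's CUDA memory leak check, which snapshots the caching-allocator and driver-level allocation counts for each device before the test body and compares them afterwards (the "was 512 and is now reported as 4608" figures in the RuntimeError are exactly such a before/after pair, and the "CUDA driver API confirmed a leak" wording refers to the second, driver-level comparison). The sketch below only illustrates that kind of before/after comparison using public `torch.cuda` calls; the function name and tolerance argument are hypothetical, and this is not the internal `CudaMemoryLeakCheck` code.

    import torch

    def check_for_leak(fn, device: int = 0, driver_tol_bytes: int = 0) -> None:
        # Snapshot both views of GPU memory before running the test body.
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)    # driver-level (free, total)
        driver_before = total - free_before

        fn()  # the test body under inspection

        # Compare again after returning cached blocks to the driver.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        if alloc_after > alloc_before and driver_after > driver_before + driver_tol_bytes:
            raise RuntimeError(
                f"possible leak: caching allocator {alloc_before} -> {alloc_after} bytes, "
                f"driver {driver_before} -> {driver_after} bytes on device {device}"
            )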
2025-12-04T12:52:45.8055204Z ======================= 1 failed, 9 deselected in 8.92s ======================== 2025-12-04T12:52:45.8055243Z Got exit code 1 2025-12-04T12:52:45.8055283Z Retrying single test... 2025-12-04T12:52:45.8055471Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-0d51898b7f977c61.xml 2025-12-04T12:52:45.8055529Z ============================= test session starts ============================== 2025-12-04T12:52:45.8055644Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8055684Z cachedir: .pytest_cache 2025-12-04T12:52:45.8055842Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8055887Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8055928Z configfile: pytest.ini 2025-12-04T12:52:45.8056090Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8056165Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8056429Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8056474Z Running 1 items in this shard 2025-12-04T12:52:45.8056476Z 2025-12-04T12:52:45.8056819Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda I1204 12:49:42.111000 501483 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 501552 2025-12-04T12:52:45.8056972Z I1204 12:49:42.112000 501483 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 501553 2025-12-04T12:52:45.8057125Z I1204 12:49:42.113000 501483 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 501554 2025-12-04T12:52:45.8057276Z I1204 12:49:42.113000 501483 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 501555 2025-12-04T12:52:45.8057791Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8057856Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8058414Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8058476Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8058962Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. 
FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8059045Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8059528Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8059587Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8059732Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8059893Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8060184Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8060339Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8060624Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8060750Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8061031Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8061178Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8061454Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8061600Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8061891Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8062031Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8062307Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8062465Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 
2025-12-04T12:52:45.8062981Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8063096Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8063313Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8063712Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8063826Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8064037Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8064203Z [rank2]:E1204 12:49:49.136000 501554 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8064245Z dist init r=2, world=4 2025-12-04T12:52:45.8064382Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8064543Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8064829Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8064983Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8065265Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8065391Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8065669Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8065816Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8066101Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8066247Z [rank1]:E1204 12:49:49.139000 501553 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8066524Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8066659Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8066945Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8067094Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8067605Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8067742Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8067938Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8068367Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8068480Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8068692Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8068855Z [rank1]:E1204 12:49:49.139000 501553 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8068895Z dist init r=1, world=4 2025-12-04T12:52:45.8069031Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8069191Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8069478Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8069633Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8069916Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in 
wrapper 2025-12-04T12:52:45.8070041Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8070335Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8070481Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8070757Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8070906Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8071194Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8071332Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8071609Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8071791Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8072302Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8072416Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8072615Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8073010Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8073126Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8073337Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8073502Z [rank0]:E1204 12:49:49.221000 501552 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8073540Z dist init r=0, world=4 2025-12-04T12:52:45.8073678Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8073838Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8074125Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8074279Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8074572Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8074697Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8074975Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8075123Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8075408Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8075555Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8075832Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8075986Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8076263Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8076412Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8076924Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8077040Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8077237Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8077633Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8077747Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8077959Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8078122Z [rank3]:E1204 12:49:49.250000 501555 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8078199Z dist init r=3, world=4 2025-12-04T12:52:45.8078535Z [rank0]:[W1204 12:49:49.052112048 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8078577Z FAILED [8.9132s] [100%] 2025-12-04T12:52:45.8078579Z 2025-12-04T12:52:45.8078635Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8078785Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8078832Z Traceback (most recent call last): 2025-12-04T12:52:45.8078993Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8079039Z self._join_processes(fn) 2025-12-04T12:52:45.8079212Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8079266Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8079457Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8079501Z raise RuntimeError(error) 2025-12-04T12:52:45.8079582Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8079627Z Traceback (most recent call last): 2025-12-04T12:52:45.8079788Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8079831Z getattr(self, test_name)() 2025-12-04T12:52:45.8080000Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8080048Z fn() 2025-12-04T12:52:45.8080198Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8080239Z method(*args, **kwargs) 2025-12-04T12:52:45.8080390Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8080432Z method(*args, **kwargs) 2025-12-04T12:52:45.8080581Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8080619Z with policy(): 2025-12-04T12:52:45.8080772Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8080814Z raise RuntimeError(msg) 2025-12-04T12:52:45.8081202Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8081206Z 2025-12-04T12:52:45.8081281Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8081552Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8081554Z 2025-12-04T12:52:45.8081642Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8081645Z 2025-12-04T12:52:45.8081704Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8081750Z Traceback (most recent call last): 2025-12-04T12:52:45.8081922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8081964Z getattr(self, test_name)() 2025-12-04T12:52:45.8082123Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8082156Z fn() 2025-12-04T12:52:45.8082309Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8082349Z method(*args, **kwargs) 2025-12-04T12:52:45.8082500Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8082542Z method(*args, **kwargs) 2025-12-04T12:52:45.8082702Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8082740Z with policy(): 2025-12-04T12:52:45.8082892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8082934Z raise RuntimeError(msg) 2025-12-04T12:52:45.8083329Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8083331Z 2025-12-04T12:52:45.8083406Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8083674Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8083677Z 2025-12-04T12:52:45.8083764Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8083777Z 2025-12-04T12:52:45.8083778Z 2025-12-04T12:52:45.8083865Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8083952Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8084185Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-0d51898b7f977c61.xml - 2025-12-04T12:52:45.8084247Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8084531Z FAILED [8.9132s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8084576Z Traceback (most recent call last): 2025-12-04T12:52:45.8084739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8084782Z getattr(self, test_name)() 2025-12-04T12:52:45.8084943Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8084978Z fn() 2025-12-04T12:52:45.8085128Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8085167Z method(*args, **kwargs) 2025-12-04T12:52:45.8085319Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8085359Z method(*args, **kwargs) 2025-12-04T12:52:45.8085512Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8085549Z with policy(): 2025-12-04T12:52:45.8085701Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8085743Z raise RuntimeError(msg) 2025-12-04T12:52:45.8086129Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8086131Z 2025-12-04T12:52:45.8086207Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8086478Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8086490Z 2025-12-04T12:52:45.8086578Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8086580Z 2025-12-04T12:52:45.8086640Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8086687Z Traceback (most recent call last): 2025-12-04T12:52:45.8086850Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8086893Z getattr(self, test_name)() 2025-12-04T12:52:45.8087050Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8087088Z fn() 2025-12-04T12:52:45.8087250Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8087291Z method(*args, **kwargs) 2025-12-04T12:52:45.8087443Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8087482Z method(*args, **kwargs) 2025-12-04T12:52:45.8087631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8087696Z with policy(): 2025-12-04T12:52:45.8087847Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8087887Z raise RuntimeError(msg) 2025-12-04T12:52:45.8088337Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8088340Z 2025-12-04T12:52:45.8088412Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8088682Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8088685Z 2025-12-04T12:52:45.8088770Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8088837Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
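Each run in this shard also repeats the `_init_utils.py:571` UserWarning: FSDP received `device_id` as the bare device string "cuda" and fell back to the current device for that rank. The warning itself names the two remedies, either calling `torch.cuda.set_device()` before FSDP initialization or passing an explicit device index as `device_id`. A minimal per-rank sketch along those lines follows; the wrapped module and the rendezvous setup are placeholders, not code from this test file.

    import torch
    import torch.distributed as dist
    from torch import nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def setup_fsdp(rank: int, world_size: int) -> FSDP:
        # Assumes the launcher already exported MASTER_ADDR / MASTER_PORT for rendezvous.
        torch.cuda.set_device(rank)  # pin this process to its GPU before wrapping, per the warning
        dist.init_process_group("nccl", rank=rank, world_size=world_size)

        model = nn.Linear(8, 8)      # placeholder module for illustration
        # Passing an explicit index (instead of the bare "cuda" device) avoids the warning entirely.
        return FSDP(model, device_id=rank)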
2025-12-04T12:52:45.8088900Z ======================= 1 failed, 9 deselected in 8.92s ======================== 2025-12-04T12:52:45.8088939Z Got exit code 1 2025-12-04T12:52:45.8089159Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda 2025-12-04T12:52:45.8089289Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8089477Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-959385e2c420bf4a.xml 2025-12-04T12:52:45.8089536Z ============================= test session starts ============================== 2025-12-04T12:52:45.8089649Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8089691Z cachedir: .pytest_cache 2025-12-04T12:52:45.8089852Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8089899Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8089942Z configfile: pytest.ini 2025-12-04T12:52:45.8090103Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8090175Z collecting ... collected 10 items / 5 deselected / 5 selected 2025-12-04T12:52:45.8090228Z stepcurrent: skipping 5 already run items. 2025-12-04T12:52:45.8090287Z Running 5 items in this shard 2025-12-04T12:52:45.8090289Z 2025-12-04T12:52:45.8090633Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda I1204 12:49:53.530000 501885 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 501954 2025-12-04T12:52:45.8090792Z I1204 12:49:53.531000 501885 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 501955 2025-12-04T12:52:45.8090944Z I1204 12:49:53.532000 501885 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 501956 2025-12-04T12:52:45.8091111Z I1204 12:49:53.532000 501885 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 501957 2025-12-04T12:52:45.8091608Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8091681Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8092183Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.8092244Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8092729Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8092789Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8093275Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8093332Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8093476Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8093639Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8093927Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8094083Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8094370Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8094493Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8094782Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8094931Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8095212Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8095369Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8095646Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8095784Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8096061Z [rank3]:E1204 12:50:00.604000 501957 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8096229Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8096742Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8096859Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8097054Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8097457Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8097577Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8097786Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8097954Z [rank3]:E1204 12:50:00.604000 501957 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8097993Z dist init r=3, world=4 2025-12-04T12:52:45.8098130Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8098318Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8098607Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8098760Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8099060Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8099187Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8099464Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8099612Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.8099901Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8100051Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8100326Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8100486Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8100765Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8100912Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8101427Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8101543Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8101737Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8102139Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8102254Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8102466Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8102630Z [rank1]:E1204 12:50:00.614000 501955 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8102672Z dist init r=1, world=4 2025-12-04T12:52:45.8102809Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8102971Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8103264Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8103418Z [rank2]:E1204 12:50:00.621000 501956 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8103704Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8103828Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8104123Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8104272Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8104550Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8104714Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8104988Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8105127Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8105406Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8105556Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8106066Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 
2025-12-04T12:52:45.8106184Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8106378Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8106779Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8106894Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8107105Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8107273Z [rank2]:E1204 12:50:00.621000 501956 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8107310Z dist init r=2, world=4 2025-12-04T12:52:45.8107460Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8107617Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8107905Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8108058Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8108391Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8108517Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8108793Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8108966Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8109241Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8109389Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8109665Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8109802Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8110081Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8110230Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8110742Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8110854Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8111049Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8111450Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8111569Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8111795Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8111960Z [rank0]:E1204 12:50:00.747000 501954 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8112001Z dist init r=0, world=4 2025-12-04T12:52:45.8112338Z [rank0]:[W1204 12:50:00.601664434 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8112378Z FAILED [9.0149s] [ 20%] 2025-12-04T12:52:45.8112381Z 2025-12-04T12:52:45.8112447Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8112584Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8112631Z Traceback (most recent call last): 2025-12-04T12:52:45.8112796Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8112839Z self._join_processes(fn) 2025-12-04T12:52:45.8113011Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8113085Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8113263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8113308Z raise RuntimeError(error) 2025-12-04T12:52:45.8113388Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8113435Z Traceback (most recent call last): 2025-12-04T12:52:45.8113597Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8113640Z getattr(self, test_name)() 2025-12-04T12:52:45.8113801Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8113836Z fn() 2025-12-04T12:52:45.8113988Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8114032Z method(*args, **kwargs) 2025-12-04T12:52:45.8114183Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8114225Z method(*args, **kwargs) 2025-12-04T12:52:45.8114376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8114416Z with policy(): 2025-12-04T12:52:45.8114566Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8114609Z raise RuntimeError(msg) 2025-12-04T12:52:45.8114994Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8114998Z 2025-12-04T12:52:45.8115073Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8115344Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8115347Z 2025-12-04T12:52:45.8115436Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8115439Z 2025-12-04T12:52:45.8115500Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8115544Z Traceback (most recent call last): 2025-12-04T12:52:45.8115716Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8115760Z getattr(self, test_name)() 2025-12-04T12:52:45.8115920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8115957Z fn() 2025-12-04T12:52:45.8116110Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8116149Z method(*args, **kwargs) 2025-12-04T12:52:45.8116309Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8116348Z method(*args, **kwargs) 2025-12-04T12:52:45.8116500Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8116537Z with policy(): 2025-12-04T12:52:45.8116692Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8116732Z raise RuntimeError(msg) 2025-12-04T12:52:45.8117130Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8117144Z 2025-12-04T12:52:45.8117220Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8117488Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8117490Z 2025-12-04T12:52:45.8117578Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8117581Z 2025-12-04T12:52:45.8117583Z 2025-12-04T12:52:45.8117657Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8117747Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8117981Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-959385e2c420bf4a.xml - 2025-12-04T12:52:45.8118043Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8118358Z FAILED [9.0149s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8118408Z Traceback (most recent call last): 2025-12-04T12:52:45.8118573Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8118615Z getattr(self, test_name)() 2025-12-04T12:52:45.8118775Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8118811Z fn() 2025-12-04T12:52:45.8118963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8119002Z method(*args, **kwargs) 2025-12-04T12:52:45.8119152Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8119192Z method(*args, **kwargs) 2025-12-04T12:52:45.8119341Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8119379Z with policy(): 2025-12-04T12:52:45.8119550Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8119592Z raise RuntimeError(msg) 2025-12-04T12:52:45.8119981Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8119985Z 2025-12-04T12:52:45.8120059Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8120342Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8120345Z 2025-12-04T12:52:45.8120432Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8120434Z 2025-12-04T12:52:45.8120492Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8120538Z Traceback (most recent call last): 2025-12-04T12:52:45.8120700Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8120767Z getattr(self, test_name)() 2025-12-04T12:52:45.8120924Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8120959Z fn() 2025-12-04T12:52:45.8121108Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8121148Z method(*args, **kwargs) 2025-12-04T12:52:45.8121298Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8121337Z method(*args, **kwargs) 2025-12-04T12:52:45.8121487Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8121527Z with policy(): 2025-12-04T12:52:45.8121676Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8121722Z raise RuntimeError(msg) 2025-12-04T12:52:45.8122103Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8122107Z 2025-12-04T12:52:45.8122181Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8124287Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8124290Z 2025-12-04T12:52:45.8124385Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8124449Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.8124518Z ======================= 1 failed, 5 deselected in 9.02s ======================== 2025-12-04T12:52:45.8124556Z Got exit code 1 2025-12-04T12:52:45.8124597Z Retrying single test... 
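[editor note] The leak report above is produced by the memory-leak check that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 wraps around each test: it snapshots both the caching-allocator counters and the driver-level usage before the test body and compares them afterwards, raising when either number grows. A minimal, simplified sketch of that idea (not the actual implementation in torch/testing/_internal/common_utils.py; function name and structure here are illustrative), assuming a CUDA/ROCm-enabled torch build:

    import torch

    def check_for_leak(fn, device=0):
        # Snapshot caching-allocator and driver-level usage before the test body.
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)
        free_before, total = torch.cuda.mem_get_info(device)

        fn()  # run the test body

        # Re-read the same counters afterwards; growth suggests a leak.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)

        driver_before = total - free_before
        driver_after = total - free_after
        if alloc_after > alloc_before or driver_after > driver_before:
            raise RuntimeError(
                f"possible leak on device {device}: "
                f"caching allocator {alloc_before} -> {alloc_after}, "
                f"driver {driver_before} -> {driver_after}"
            )

The numbers in the log (512 -> 4608 allocator bytes, ~2.3 GB -> ~3.0 GB driver memory per rank) correspond to exactly this kind of before/after comparison; the real check does additional bookkeeping to reduce false positives.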
2025-12-04T12:52:45.8124789Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-bf894163f060bb1c.xml 2025-12-04T12:52:45.8124847Z ============================= test session starts ============================== 2025-12-04T12:52:45.8124961Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8125002Z cachedir: .pytest_cache 2025-12-04T12:52:45.8125182Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8125229Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8125270Z configfile: pytest.ini 2025-12-04T12:52:45.8125433Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8125508Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8125771Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8125815Z Running 1 items in this shard 2025-12-04T12:52:45.8125831Z 2025-12-04T12:52:45.8126178Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda I1204 12:50:04.916000 502287 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 502356 2025-12-04T12:52:45.8126333Z I1204 12:50:04.917000 502287 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 502357 2025-12-04T12:52:45.8126501Z I1204 12:50:04.918000 502287 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 502358 2025-12-04T12:52:45.8126664Z I1204 12:50:04.919000 502287 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 502359 2025-12-04T12:52:45.8127169Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8127232Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8127723Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8127786Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8128310Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.8128368Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8128855Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8128914Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8129058Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8129222Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8129538Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8129694Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8129981Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8130108Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8130398Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8130547Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8130825Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8130998Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8131273Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8131411Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8131690Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8131839Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8132356Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8132473Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8132669Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8133068Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8133184Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8133395Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8133561Z [rank2]:E1204 12:50:11.968000 502358 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8133600Z dist init r=2, world=4 2025-12-04T12:52:45.8133739Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8133914Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8134204Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8134360Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8134654Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8134780Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8135056Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8135213Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8135511Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8135658Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8135933Z [rank0]:E1204 12:50:12.076000 502356 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8136069Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8136349Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8136499Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8137012Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2464153600 and is now 3196059648. 2025-12-04T12:52:45.8137127Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8137321Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8137721Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8137835Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8138047Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8138271Z [rank0]:E1204 12:50:12.076000 502356 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8138313Z dist init r=0, world=4 2025-12-04T12:52:45.8138451Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8138611Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8138914Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8139067Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8139351Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8139486Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T12:52:45.8139775Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8139922Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8140197Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8140344Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8140617Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8140756Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8141034Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8141184Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8141695Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 
2025-12-04T12:52:45.8141811Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8142005Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8142409Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8142523Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8142732Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8142897Z [rank3]:E1204 12:50:12.158000 502359 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8142935Z dist init r=3, world=4 2025-12-04T12:52:45.8143075Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8143245Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8143533Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8143686Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8143990Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8144113Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8144388Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8144536Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8144811Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8144959Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8145234Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8145369Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8145648Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8145795Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8146307Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8146421Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8146625Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8147020Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8147135Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8147356Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8147519Z [rank1]:E1204 12:50:12.178000 502357 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8147558Z dist init r=1, world=4 2025-12-04T12:52:45.8147895Z [rank0]:[W1204 12:50:12.922941439 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8147945Z FAILED [8.9130s] [100%] 2025-12-04T12:52:45.8147957Z 2025-12-04T12:52:45.8148014Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8148182Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8148229Z Traceback (most recent call last): 2025-12-04T12:52:45.8148393Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8148437Z self._join_processes(fn) 2025-12-04T12:52:45.8148609Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8148665Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8148841Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8148886Z raise RuntimeError(error) 2025-12-04T12:52:45.8148967Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8149012Z Traceback (most recent call last): 2025-12-04T12:52:45.8149171Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8149214Z getattr(self, test_name)() 2025-12-04T12:52:45.8149372Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8149407Z fn() 2025-12-04T12:52:45.8149558Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8149600Z method(*args, **kwargs) 2025-12-04T12:52:45.8149750Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8149791Z method(*args, **kwargs) 2025-12-04T12:52:45.8149941Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8149978Z with policy(): 2025-12-04T12:52:45.8150129Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8150171Z raise RuntimeError(msg) 2025-12-04T12:52:45.8150559Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2464153600 and is now 3196059648. 
2025-12-04T12:52:45.8150575Z 2025-12-04T12:52:45.8150650Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8150919Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8150923Z 2025-12-04T12:52:45.8151010Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8151012Z 2025-12-04T12:52:45.8151072Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8151116Z Traceback (most recent call last): 2025-12-04T12:52:45.8151293Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8151335Z getattr(self, test_name)() 2025-12-04T12:52:45.8151495Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8151529Z fn() 2025-12-04T12:52:45.8151679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8151741Z method(*args, **kwargs) 2025-12-04T12:52:45.8151903Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8151943Z method(*args, **kwargs) 2025-12-04T12:52:45.8152091Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8152128Z with policy(): 2025-12-04T12:52:45.8152279Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8152320Z raise RuntimeError(msg) 2025-12-04T12:52:45.8152708Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8152711Z 2025-12-04T12:52:45.8152785Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8153054Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8153057Z 2025-12-04T12:52:45.8153144Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8153147Z 2025-12-04T12:52:45.8153149Z 2025-12-04T12:52:45.8153225Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8153312Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8153547Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-bf894163f060bb1c.xml - 2025-12-04T12:52:45.8153607Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8153890Z FAILED [8.9130s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8153936Z Traceback (most recent call last): 2025-12-04T12:52:45.8154099Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8154141Z getattr(self, test_name)() 2025-12-04T12:52:45.8154300Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8154334Z fn() 2025-12-04T12:52:45.8154496Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8154535Z method(*args, **kwargs) 2025-12-04T12:52:45.8154688Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8154727Z method(*args, **kwargs) 2025-12-04T12:52:45.8154877Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8154913Z with policy(): 2025-12-04T12:52:45.8155075Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8155115Z raise RuntimeError(msg) 2025-12-04T12:52:45.8155501Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2464153600 and is now 3196059648. 
2025-12-04T12:52:45.8155504Z 2025-12-04T12:52:45.8155587Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8155865Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8155867Z 2025-12-04T12:52:45.8155954Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8155956Z 2025-12-04T12:52:45.8156016Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8156062Z Traceback (most recent call last): 2025-12-04T12:52:45.8156223Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8156266Z getattr(self, test_name)() 2025-12-04T12:52:45.8156424Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8156458Z fn() 2025-12-04T12:52:45.8156608Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8156649Z method(*args, **kwargs) 2025-12-04T12:52:45.8156799Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8156838Z method(*args, **kwargs) 2025-12-04T12:52:45.8156988Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8157024Z with policy(): 2025-12-04T12:52:45.8157175Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8157215Z raise RuntimeError(msg) 2025-12-04T12:52:45.8157602Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8157606Z 2025-12-04T12:52:45.8157677Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8157944Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8157947Z 2025-12-04T12:52:45.8158033Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8158097Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.8158216Z ======================= 1 failed, 9 deselected in 8.92s ======================== 2025-12-04T12:52:45.8158254Z Got exit code 1 2025-12-04T12:52:45.8158293Z Retrying single test... 
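[editor note] Two recurring warnings in the retries above point at per-rank setup hygiene that the messages themselves recommend: the FSDP UserWarning asks for an explicit device index (or a prior torch.cuda.set_device() call) instead of the bare "cuda" string, and ProcessGroupNCCL warns that destroy_process_group() was never called before exit. A rough sketch of that recommended shape (hypothetical helper and model_fn, not the test's actual code; assumes MASTER_ADDR/MASTER_PORT are set and one GPU per rank):

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def run(rank: int, world_size: int, model_fn):
        # Bind this process to one GPU before any collective or FSDP call,
        # as the UserWarning in the log suggests.
        torch.cuda.set_device(rank)
        dist.init_process_group("nccl", rank=rank, world_size=world_size)

        # Pass an explicit device index rather than the bare "cuda" string.
        model = FSDP(model_fn().cuda(rank), device_id=rank)

        # ... forward/backward steps for the test body would go here ...

        # Tear the process group down explicitly so NCCL/RCCL resources are
        # released, avoiding the "destroy_process_group() was not called" warning.
        dist.destroy_process_group()

Neither warning is the reported failure itself (the failure is the leak check), but both indicate cleanup paths that the multiprocess test harness skips when a rank exits with code 10.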
2025-12-04T12:52:45.8158484Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a6fd76a4add03bc4.xml 2025-12-04T12:52:45.8158545Z ============================= test session starts ============================== 2025-12-04T12:52:45.8158655Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8158697Z cachedir: .pytest_cache 2025-12-04T12:52:45.8158868Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8158916Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8158955Z configfile: pytest.ini 2025-12-04T12:52:45.8159118Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8159191Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8159453Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8159522Z Running 1 items in this shard 2025-12-04T12:52:45.8159524Z 2025-12-04T12:52:45.8159868Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda I1204 12:50:16.506000 502689 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 502758 2025-12-04T12:52:45.8160023Z I1204 12:50:16.507000 502689 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 502759 2025-12-04T12:52:45.8160176Z I1204 12:50:16.508000 502689 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 502760 2025-12-04T12:52:45.8160326Z I1204 12:50:16.509000 502689 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 502761 2025-12-04T12:52:45.8160821Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8160884Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8161371Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8161431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8161917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.8161976Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8162470Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8162527Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8162671Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8162833Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8163134Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8163288Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8163574Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8163699Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8163993Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8164141Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8164418Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8164566Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8164839Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8164978Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8165256Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8165404Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8165918Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8166035Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8166231Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8166632Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8166755Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8166967Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8167131Z [rank1]:E1204 12:50:23.690000 502759 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8167170Z dist init r=1, world=4 2025-12-04T12:52:45.8167307Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8167487Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8167774Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8167927Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8168256Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8168380Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8168657Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8168804Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8169081Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8169228Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8169503Z [rank2]:E1204 12:50:23.886000 502760 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8169640Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8169916Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8170064Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8170578Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8170694Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8170887Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8171308Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8171425Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8171635Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8171813Z [rank2]:E1204 12:50:23.886000 502760 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8171852Z dist init r=2, world=4 2025-12-04T12:52:45.8171991Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8172150Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8172452Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8172617Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8172904Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8173028Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T12:52:45.8173304Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8173455Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8173731Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8173879Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8174154Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8174290Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8174568Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8174715Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8175235Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 
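The UserWarning repeated above ("FSDP got the argument `device_id` cuda ... which does not have an explicit index") can be avoided by pinning each rank to an explicit device before wrapping the model. A minimal sketch, assuming the process group is already initialized; the function and argument names here are illustrative, not taken from this run:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(model, rank):
        torch.cuda.set_device(rank)                    # pin this process to an explicit device index
        return FSDP(model.cuda(rank), device_id=rank)  # pass an indexed device_id instead of bare "cuda"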
2025-12-04T12:52:45.8175348Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8175544Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8175941Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8176065Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8176276Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8176439Z [rank3]:E1204 12:50:23.917000 502761 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8176487Z dist init r=3, world=4 2025-12-04T12:52:45.8176632Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8176791Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8177077Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8177230Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8177515Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8177639Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8177915Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8178062Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8178379Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8178525Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8178801Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8178937Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8179214Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8179362Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8179884Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8179999Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8180205Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8180602Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8180716Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8180938Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8181114Z [rank0]:E1204 12:50:23.988000 502758 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8181152Z dist init r=0, world=4 2025-12-04T12:52:45.8181488Z [rank0]:[W1204 12:50:24.070318785 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8181527Z FAILED [9.1173s] [100%] 2025-12-04T12:52:45.8181529Z 2025-12-04T12:52:45.8181585Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8181718Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8181766Z Traceback (most recent call last): 2025-12-04T12:52:45.8181929Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8181973Z self._join_processes(fn) 2025-12-04T12:52:45.8182144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8182200Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8182378Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8182422Z raise RuntimeError(error) 2025-12-04T12:52:45.8182506Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8182550Z Traceback (most recent call last): 2025-12-04T12:52:45.8182711Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8182754Z getattr(self, test_name)() 2025-12-04T12:52:45.8182912Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8182946Z fn() 2025-12-04T12:52:45.8183099Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8183140Z method(*args, **kwargs) 2025-12-04T12:52:45.8183290Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8183329Z method(*args, **kwargs) 2025-12-04T12:52:45.8183499Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8183536Z with policy(): 2025-12-04T12:52:45.8183688Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8183730Z raise RuntimeError(msg) 2025-12-04T12:52:45.8184117Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8184119Z 2025-12-04T12:52:45.8184204Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8184476Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8184478Z 2025-12-04T12:52:45.8184568Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8184582Z 2025-12-04T12:52:45.8184583Z 2025-12-04T12:52:45.8184659Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8184757Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8184992Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a6fd76a4add03bc4.xml - 2025-12-04T12:52:45.8185053Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8185335Z FAILED [9.1173s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8185383Z Traceback (most recent call last): 2025-12-04T12:52:45.8185546Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8185590Z getattr(self, test_name)() 2025-12-04T12:52:45.8185751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8185784Z fn() 2025-12-04T12:52:45.8185936Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8185976Z method(*args, **kwargs) 2025-12-04T12:52:45.8186127Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8186166Z method(*args, **kwargs) 2025-12-04T12:52:45.8186316Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8186353Z with policy(): 2025-12-04T12:52:45.8186504Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8186545Z raise RuntimeError(msg) 2025-12-04T12:52:45.8186934Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8186936Z 2025-12-04T12:52:45.8187010Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8187281Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8187283Z 2025-12-04T12:52:45.8187380Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8187445Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.8187509Z ======================= 1 failed, 9 deselected in 9.13s ======================== 2025-12-04T12:52:45.8187547Z Got exit code 1 2025-12-04T12:52:45.8187766Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda 2025-12-04T12:52:45.8187894Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8188093Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-fc624f2ff706e807.xml 2025-12-04T12:52:45.8188184Z ============================= test session starts ============================== 2025-12-04T12:52:45.8188299Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8188340Z cachedir: .pytest_cache 2025-12-04T12:52:45.8188497Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8188576Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8188617Z configfile: pytest.ini 2025-12-04T12:52:45.8188777Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8188850Z collecting ... collected 10 items / 6 deselected / 4 selected 2025-12-04T12:52:45.8188902Z stepcurrent: skipping 6 already run items. 2025-12-04T12:52:45.8188947Z Running 4 items in this shard 2025-12-04T12:52:45.8188949Z 2025-12-04T12:52:45.8189290Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda I1204 12:50:28.562000 503091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 503160 2025-12-04T12:52:45.8189444Z I1204 12:50:28.563000 503091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 503161 2025-12-04T12:52:45.8189598Z I1204 12:50:28.564000 503091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 503162 2025-12-04T12:52:45.8189750Z I1204 12:50:28.565000 503091 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 503163 2025-12-04T12:52:45.8190247Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8190310Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8190795Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.8190857Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8191340Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8191414Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8191902Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8191961Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8192104Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8192282Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8192572Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8192726Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8193031Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8193157Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8193435Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8193583Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8193859Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8194006Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8194282Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8194420Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8194703Z [rank1]:E1204 12:50:35.632000 503161 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8194852Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8195367Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8195484Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8195679Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8196085Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8196202Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8196412Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8196585Z [rank1]:E1204 12:50:35.632000 503161 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8196624Z dist init r=1, world=4 2025-12-04T12:52:45.8196763Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8196922Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8197222Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8197394Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8197677Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8197802Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8198079Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8198267Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.8198541Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8198689Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8198964Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8199099Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8199381Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8199530Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8200057Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8200171Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8200367Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8200761Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8200886Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8201097Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8201259Z [rank3]:E1204 12:50:35.763000 503163 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8201309Z dist init r=3, world=4 2025-12-04T12:52:45.8201459Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8201618Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8201906Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8202061Z [rank0]:E1204 12:50:35.768000 503160 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8202344Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8202468Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8202744Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8202892Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8203168Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8203314Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8203589Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8203726Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8204005Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8204153Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8204670Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8204786Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8204989Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8205383Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8205496Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8205714Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8205889Z [rank0]:E1204 12:50:35.768000 503160 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8205927Z dist init r=0, world=4 2025-12-04T12:52:45.8206065Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8206224Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8206512Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8206665Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8206949Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8207075Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8207348Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8207496Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8207769Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8207918Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8208232Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8208369Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8208660Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8208809Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8209331Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8209444Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8209639Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8210033Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8210168Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8210379Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8210542Z [rank2]:E1204 12:50:35.784000 503162 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8210581Z dist init r=2, world=4 2025-12-04T12:52:45.8210919Z [rank0]:[W1204 12:50:36.713346171 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8210961Z FAILED [9.3136s] [ 25%] 2025-12-04T12:52:45.8210963Z 2025-12-04T12:52:45.8211019Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8211153Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8211199Z Traceback (most recent call last): 2025-12-04T12:52:45.8211362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8211407Z self._join_processes(fn) 2025-12-04T12:52:45.8211579Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8211636Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8211813Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8211858Z raise RuntimeError(error) 2025-12-04T12:52:45.8211938Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8211983Z Traceback (most recent call last): 2025-12-04T12:52:45.8212145Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8212188Z getattr(self, test_name)() 2025-12-04T12:52:45.8212345Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8212380Z fn() 2025-12-04T12:52:45.8212539Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8212580Z method(*args, **kwargs) 2025-12-04T12:52:45.8212730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8212771Z method(*args, **kwargs) 2025-12-04T12:52:45.8212920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8212957Z with policy(): 2025-12-04T12:52:45.8213110Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8213151Z raise RuntimeError(msg) 2025-12-04T12:52:45.8213549Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8213552Z 2025-12-04T12:52:45.8213627Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8213910Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8213924Z 2025-12-04T12:52:45.8214012Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8214014Z 2025-12-04T12:52:45.8214074Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8214118Z Traceback (most recent call last): 2025-12-04T12:52:45.8214282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8214323Z getattr(self, test_name)() 2025-12-04T12:52:45.8214483Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8214517Z fn() 2025-12-04T12:52:45.8214668Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8214709Z method(*args, **kwargs) 2025-12-04T12:52:45.8214859Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8214897Z method(*args, **kwargs) 2025-12-04T12:52:45.8215047Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8215083Z with policy(): 2025-12-04T12:52:45.8215237Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8215277Z raise RuntimeError(msg) 2025-12-04T12:52:45.8215664Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8215668Z 2025-12-04T12:52:45.8215742Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8216009Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8216011Z 2025-12-04T12:52:45.8216100Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8216102Z 2025-12-04T12:52:45.8216159Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8216204Z Traceback (most recent call last): 2025-12-04T12:52:45.8216374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8216417Z getattr(self, test_name)() 2025-12-04T12:52:45.8216574Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8216609Z fn() 2025-12-04T12:52:45.8216759Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8216798Z method(*args, **kwargs) 2025-12-04T12:52:45.8216947Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8216997Z method(*args, **kwargs) 2025-12-04T12:52:45.8217147Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8217183Z with policy(): 2025-12-04T12:52:45.8217334Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8217375Z raise RuntimeError(msg) 2025-12-04T12:52:45.8217758Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8217782Z 2025-12-04T12:52:45.8217854Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8218122Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8218124Z 2025-12-04T12:52:45.8218248Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8218250Z 2025-12-04T12:52:45.8218253Z 2025-12-04T12:52:45.8218329Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8218417Z Process 0 terminated with exit code 10, terminating remaining processes. 
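The ProcessGroupNCCL warning emitted just before the FAILED line ("destroy_process_group() was not called before program exit, which can leak resources") points at missing teardown in the spawned test processes. A minimal sketch of an explicit teardown, assuming torch.distributed is the process-group API in use; the helper name is illustrative:

    import torch.distributed as dist

    def teardown_process_group():
        if dist.is_initialized():         # only tear down if a group was actually created
            dist.barrier()                # let all ranks finish outstanding collectives first
            dist.destroy_process_group()  # release NCCL/RCCL communicator resources before exit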
2025-12-04T12:52:45.8218651Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-fc624f2ff706e807.xml - 2025-12-04T12:52:45.8218712Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8218993Z FAILED [9.3136s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8219039Z Traceback (most recent call last): 2025-12-04T12:52:45.8219201Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8219244Z getattr(self, test_name)() 2025-12-04T12:52:45.8219403Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8219438Z fn() 2025-12-04T12:52:45.8219588Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8219629Z method(*args, **kwargs) 2025-12-04T12:52:45.8219778Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8219817Z method(*args, **kwargs) 2025-12-04T12:52:45.8219966Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8220010Z with policy(): 2025-12-04T12:52:45.8220163Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8220217Z raise RuntimeError(msg) 2025-12-04T12:52:45.8220601Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8220605Z 2025-12-04T12:52:45.8220677Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8220955Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8220957Z 2025-12-04T12:52:45.8221043Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8221045Z 2025-12-04T12:52:45.8221104Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8221149Z Traceback (most recent call last): 2025-12-04T12:52:45.8221312Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8221368Z getattr(self, test_name)() 2025-12-04T12:52:45.8221541Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8221574Z fn() 2025-12-04T12:52:45.8221725Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8221764Z method(*args, **kwargs) 2025-12-04T12:52:45.8221914Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8221952Z method(*args, **kwargs) 2025-12-04T12:52:45.8222102Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8222140Z with policy(): 2025-12-04T12:52:45.8222290Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8222332Z raise RuntimeError(msg) 2025-12-04T12:52:45.8222717Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8222719Z 2025-12-04T12:52:45.8222791Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8223056Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8223058Z 2025-12-04T12:52:45.8223145Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8223147Z 2025-12-04T12:52:45.8223204Z Process 2 exited with error code 10 and exception: 2025-12-04T12:52:45.8223251Z Traceback (most recent call last): 2025-12-04T12:52:45.8223412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8223454Z getattr(self, test_name)() 2025-12-04T12:52:45.8223613Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8223646Z fn() 2025-12-04T12:52:45.8223797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8223836Z method(*args, **kwargs) 2025-12-04T12:52:45.8223985Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8224036Z method(*args, **kwargs) 2025-12-04T12:52:45.8224185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8224223Z with policy(): 2025-12-04T12:52:45.8224374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8224414Z raise RuntimeError(msg) 2025-12-04T12:52:45.8224809Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8224812Z 2025-12-04T12:52:45.8224884Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8225151Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8225163Z 2025-12-04T12:52:45.8225248Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8225322Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.8225387Z ======================= 1 failed, 6 deselected in 9.32s ======================== 2025-12-04T12:52:45.8225423Z Got exit code 1 2025-12-04T12:52:45.8225463Z Retrying single test... 
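Each failure above is raised by the CUDA memory-leak check, which compares per-device caching-allocator usage before and after the test body (here 512 bytes before versus 4608 bytes after on every rank). A minimal sketch of that kind of before/after comparison, purely illustrative and not the actual implementation in common_utils.py:

    import torch

    def check_for_leak(test_fn, device=0):
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)   # caching-allocator bytes before the test
        test_fn()
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)    # caching-allocator bytes after the test
        if after > before:
            raise RuntimeError(f"possible leak on device {device}: {before} -> {after} bytes")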
2025-12-04T12:52:45.8225654Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-b62b5adc14a929a0.xml 2025-12-04T12:52:45.8225711Z ============================= test session starts ============================== 2025-12-04T12:52:45.8225822Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8225863Z cachedir: .pytest_cache 2025-12-04T12:52:45.8226020Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8226067Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8226108Z configfile: pytest.ini 2025-12-04T12:52:45.8226271Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8226343Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8226605Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8226648Z Running 1 items in this shard 2025-12-04T12:52:45.8226650Z 2025-12-04T12:52:45.8226993Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda I1204 12:50:40.280000 503493 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 503562 2025-12-04T12:52:45.8227149Z I1204 12:50:40.282000 503493 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 503563 2025-12-04T12:52:45.8227302Z I1204 12:50:40.282000 503493 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 503564 2025-12-04T12:52:45.8227454Z I1204 12:50:40.283000 503493 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 503565 2025-12-04T12:52:45.8227959Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8228021Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8228546Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8228610Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8229115Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
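The UserWarning above states its own remedy: make the device explicit before constructing FSDP, either by calling torch.cuda.set_device() first or by passing an indexed device as device_id instead of the bare "cuda" string. A minimal sketch of both options follows; it assumes a single-node job where the local device index equals the process rank and the process group has already been initialized by the test harness.

# Illustrative sketch of the two fixes suggested by the FSDP UserWarning above.
# Assumes torch.distributed.init_process_group(...) has already run and that
# rank maps 1:1 to a local GPU index.
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(rank: int) -> FSDP:
    # Option 1: set the current device explicitly before FSDP initialization.
    torch.cuda.set_device(rank)
    model = nn.Linear(8, 8)
    # Option 2: pass an indexed device (or the plain int) rather than "cuda".
    return FSDP(model, device_id=torch.device("cuda", rank))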
2025-12-04T12:52:45.8229173Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8229660Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8229742Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8229886Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8230047Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8230336Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8230491Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8230776Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8230903Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8231179Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8231329Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8231603Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8231753Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8232029Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8232165Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8232453Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8232601Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8233126Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in 
__mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8233241Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8233437Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8233835Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8233968Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8234182Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8234346Z [rank1]:E1204 12:50:47.252000 503563 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8234385Z dist init r=1, world=4 2025-12-04T12:52:45.8234525Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8234685Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8234974Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8235128Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8235413Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8235539Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8235816Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8235963Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8236240Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8236387Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8236673Z [rank0]:E1204 12:50:47.432000 503562 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8236810Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8237088Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8237236Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8237756Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8237870Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8238085Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8238504Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8238618Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8238830Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8238994Z [rank0]:E1204 12:50:47.432000 503562 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8239033Z dist init r=0, world=4 2025-12-04T12:52:45.8239170Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8239328Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8239614Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8239767Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8240050Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8240176Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T12:52:45.8240451Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8240599Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8240888Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8241036Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8241314Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8241450Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8241738Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8241887Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8242395Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 
2025-12-04T12:52:45.8242532Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8242728Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8243122Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8243235Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8243448Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8243611Z [rank3]:E1204 12:50:47.434000 503565 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8243650Z dist init r=3, world=4 2025-12-04T12:52:45.8243787Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8243947Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8244233Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8244388Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8244671Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8244795Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8245085Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8245232Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8245508Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8245656Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8245944Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8246081Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8246357Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8246524Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8247033Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8247148Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8247343Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8247738Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8247852Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8248064Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8248279Z [rank2]:E1204 12:50:47.592000 503564 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8248317Z dist init r=2, world=4 2025-12-04T12:52:45.8248652Z [rank0]:[W1204 12:50:47.297044670 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8248692Z FAILED [8.8136s] [100%] 2025-12-04T12:52:45.8248694Z 2025-12-04T12:52:45.8248751Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8248883Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8248931Z Traceback (most recent call last): 2025-12-04T12:52:45.8249093Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8249139Z self._join_processes(fn) 2025-12-04T12:52:45.8249344Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8249398Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8249577Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8249622Z raise RuntimeError(error) 2025-12-04T12:52:45.8249703Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8249748Z Traceback (most recent call last): 2025-12-04T12:52:45.8249923Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8249966Z getattr(self, test_name)() 2025-12-04T12:52:45.8250124Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8250159Z fn() 2025-12-04T12:52:45.8250311Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8250351Z method(*args, **kwargs) 2025-12-04T12:52:45.8250519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8250572Z method(*args, **kwargs) 2025-12-04T12:52:45.8250722Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8250760Z with policy(): 2025-12-04T12:52:45.8250912Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8250953Z raise RuntimeError(msg) 2025-12-04T12:52:45.8251339Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8251341Z 2025-12-04T12:52:45.8251417Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8251685Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8251687Z 2025-12-04T12:52:45.8251776Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8251778Z 2025-12-04T12:52:45.8251780Z 2025-12-04T12:52:45.8251855Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8251943Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8252175Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-b62b5adc14a929a0.xml - 2025-12-04T12:52:45.8252236Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8252517Z FAILED [8.8136s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8252564Z Traceback (most recent call last): 2025-12-04T12:52:45.8252727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8252770Z getattr(self, test_name)() 2025-12-04T12:52:45.8252930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8252963Z fn() 2025-12-04T12:52:45.8253125Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8253165Z method(*args, **kwargs) 2025-12-04T12:52:45.8253315Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8253357Z method(*args, **kwargs) 2025-12-04T12:52:45.8253506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8253542Z with policy(): 2025-12-04T12:52:45.8253695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8253734Z raise RuntimeError(msg) 2025-12-04T12:52:45.8254128Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8254131Z 2025-12-04T12:52:45.8254204Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8254483Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8254495Z 2025-12-04T12:52:45.8254583Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8254645Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
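Alongside the leak failure, each retry also prints the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. The fragment below is a minimal, single-process illustration of the init/teardown bracket that warning asks for; the "gloo" backend, address, and port are placeholder assumptions, not what the test harness actually uses.

# Illustrative init/teardown bracket; backend, address, and port are assumptions.
import os
import torch.distributed as dist

def main() -> None:
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)
    try:
        dist.barrier()  # stand-in for the real collective work
    finally:
        # Explicit teardown avoids the "destroy_process_group() was not called" warning.
        dist.destroy_process_group()

if __name__ == "__main__":
    main()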
2025-12-04T12:52:45.8254708Z ======================= 1 failed, 9 deselected in 8.82s ======================== 2025-12-04T12:52:45.8254745Z Got exit code 1 2025-12-04T12:52:45.8254785Z Retrying single test... 2025-12-04T12:52:45.8254974Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-687e88c5a858511f.xml 2025-12-04T12:52:45.8255064Z ============================= test session starts ============================== 2025-12-04T12:52:45.8255220Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8255264Z cachedir: .pytest_cache 2025-12-04T12:52:45.8255422Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8255469Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8255509Z configfile: pytest.ini 2025-12-04T12:52:45.8255674Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8255746Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8256009Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8256052Z Running 1 items in this shard 2025-12-04T12:52:45.8256054Z 2025-12-04T12:52:45.8256394Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda I1204 12:50:51.842000 503895 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 503964 2025-12-04T12:52:45.8256550Z I1204 12:50:51.844000 503895 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 503965 2025-12-04T12:52:45.8256705Z I1204 12:50:51.844000 503895 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 503966 2025-12-04T12:52:45.8256857Z I1204 12:50:51.845000 503895 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 503967 2025-12-04T12:52:45.8257365Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8257431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8257926Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8257990Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8258533Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. 
FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8258627Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8259109Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8259166Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8259311Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8259473Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8259764Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8259921Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8260206Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8260331Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8260608Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8260757Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8261032Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8261181Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8261468Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8261605Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8261882Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8262031Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 
2025-12-04T12:52:45.8262570Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8262686Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8262881Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8263306Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8263420Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8263631Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8263795Z [rank2]:E1204 12:50:58.887000 503966 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8263839Z dist init r=2, world=4 2025-12-04T12:52:45.8263978Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8264139Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8264425Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8264580Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8264866Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8264991Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8265267Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8265415Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8265691Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8265847Z [rank3]:E1204 12:50:58.900000 503967 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8266121Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8266260Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8266547Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8266695Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8267206Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8267342Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8267535Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8267930Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8268044Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8268301Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8268466Z [rank3]:E1204 12:50:58.900000 503967 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8268504Z dist init r=3, world=4 2025-12-04T12:52:45.8268644Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8268801Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8269090Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8269244Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8269531Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in 
wrapper 2025-12-04T12:52:45.8269657Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8269933Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8270092Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8270365Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8270516Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8270801Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8270937Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8271213Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8271372Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8271900Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8272013Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8272208Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8272603Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8272717Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8272928Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8273090Z [rank0]:E1204 12:50:59.087000 503964 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8273128Z dist init r=0, world=4 2025-12-04T12:52:45.8273266Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8273426Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8273712Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8273866Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8274151Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8274285Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8274560Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8274708Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8274994Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8275140Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8275415Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8275551Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8275851Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8275999Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8276511Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8276625Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8276820Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8277215Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8277329Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8277539Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8277703Z [rank1]:E1204 12:50:59.133000 503965 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8277742Z dist init r=1, world=4 2025-12-04T12:52:45.8278078Z [rank0]:[W1204 12:50:59.974819294 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8278117Z FAILED [8.9142s] [100%] 2025-12-04T12:52:45.8278119Z 2025-12-04T12:52:45.8278214Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8278347Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda _ 2025-12-04T12:52:45.8278393Z Traceback (most recent call last): 2025-12-04T12:52:45.8278571Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8278618Z self._join_processes(fn) 2025-12-04T12:52:45.8278792Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8278846Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8279024Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8279066Z raise RuntimeError(error) 2025-12-04T12:52:45.8279159Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8279204Z Traceback (most recent call last): 2025-12-04T12:52:45.8279366Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8279409Z getattr(self, test_name)() 2025-12-04T12:52:45.8279568Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8279614Z fn() 2025-12-04T12:52:45.8279765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8279819Z method(*args, **kwargs) 2025-12-04T12:52:45.8279972Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8280011Z method(*args, **kwargs) 2025-12-04T12:52:45.8280162Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8280199Z with policy(): 2025-12-04T12:52:45.8280351Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8280392Z raise RuntimeError(msg) 2025-12-04T12:52:45.8280781Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 
2025-12-04T12:52:45.8280785Z 2025-12-04T12:52:45.8280862Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8281133Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8281135Z 2025-12-04T12:52:45.8281224Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8281226Z 2025-12-04T12:52:45.8281228Z 2025-12-04T12:52:45.8281303Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8281390Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8281620Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-687e88c5a858511f.xml - 2025-12-04T12:52:45.8281682Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8281961Z FAILED [8.9142s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8282008Z Traceback (most recent call last): 2025-12-04T12:52:45.8282171Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8282214Z getattr(self, test_name)() 2025-12-04T12:52:45.8282385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8282419Z fn() 2025-12-04T12:52:45.8282570Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8282612Z method(*args, **kwargs) 2025-12-04T12:52:45.8282762Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8282801Z method(*args, **kwargs) 2025-12-04T12:52:45.8282951Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8283000Z with policy(): 2025-12-04T12:52:45.8283154Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8283193Z raise RuntimeError(msg) 2025-12-04T12:52:45.8283580Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8283602Z 2025-12-04T12:52:45.8283676Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8283945Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8283946Z 2025-12-04T12:52:45.8284035Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8284097Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.8284159Z ======================= 1 failed, 9 deselected in 8.92s ======================== 2025-12-04T12:52:45.8284196Z Got exit code 1 2025-12-04T12:52:45.8284415Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda 2025-12-04T12:52:45.8284542Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8284733Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-865ac1d538c948a6.xml 2025-12-04T12:52:45.8284789Z ============================= test session starts ============================== 2025-12-04T12:52:45.8284902Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8284942Z cachedir: .pytest_cache 2025-12-04T12:52:45.8285103Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8285148Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8285189Z configfile: pytest.ini 2025-12-04T12:52:45.8285350Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8285424Z collecting ... collected 10 items / 7 deselected / 3 selected 2025-12-04T12:52:45.8285477Z stepcurrent: skipping 7 already run items. 2025-12-04T12:52:45.8285523Z Running 3 items in this shard 2025-12-04T12:52:45.8285525Z 2025-12-04T12:52:45.8285866Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda I1204 12:51:03.413000 504297 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 504366 2025-12-04T12:52:45.8286020Z I1204 12:51:03.414000 504297 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 504367 2025-12-04T12:52:45.8286181Z I1204 12:51:03.415000 504297 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 504368 2025-12-04T12:52:45.8286330Z I1204 12:51:03.415000 504297 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 504369 2025-12-04T12:52:45.8286829Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8286901Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8287392Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T12:52:45.8287452Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8287946Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8288015Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8288535Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8288592Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8288736Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8288899Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8289189Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8289343Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8289628Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8289753Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8290032Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8290182Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8290459Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8290621Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8290896Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8291034Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8291327Z [rank2]:E1204 12:51:10.515000 504368 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8291475Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8291989Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8292130Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8292325Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8292726Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8292840Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8293052Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8293216Z [rank2]:E1204 12:51:10.515000 504368 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8293255Z dist init r=2, world=4 2025-12-04T12:52:45.8293397Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8293555Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8293842Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8293997Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8294281Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8294405Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8294683Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8294842Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.8295117Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8295265Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8295550Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8295685Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8295962Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8296109Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8296639Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2243952640 and is now 2986344448. 2025-12-04T12:52:45.8296753Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8296949Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8297346Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8297461Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8297672Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8297836Z [rank3]:E1204 12:51:10.518000 504369 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8297876Z dist init r=3, world=4 2025-12-04T12:52:45.8298013Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8298208Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8298494Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8298648Z [rank0]:E1204 12:51:10.593000 504366 
site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8298933Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8299070Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8299347Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8299496Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8299771Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8299929Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8300204Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8300340Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8300644Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8300792Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8301301Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8301415Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8301610Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8302007Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8302120Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8302330Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8302493Z [rank0]:E1204 12:51:10.593000 504366 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8302532Z dist init r=0, world=4 2025-12-04T12:52:45.8302671Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8302828Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8303114Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8303277Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8303559Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8303686Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8303960Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8304117Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8304393Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8304539Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8304824Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8304969Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8305246Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8305394Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8305902Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8306017Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8306213Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8306611Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8306723Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8306934Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8307099Z [rank1]:E1204 12:51:10.697000 504367 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8307137Z dist init r=1, world=4 2025-12-04T12:52:45.8307474Z [rank0]:[W1204 12:51:10.436636023 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8307514Z FAILED [9.0133s] [ 33%] 2025-12-04T12:52:45.8307532Z 2025-12-04T12:52:45.8307588Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8307722Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8307769Z Traceback (most recent call last): 2025-12-04T12:52:45.8307931Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8307975Z self._join_processes(fn) 2025-12-04T12:52:45.8308170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8308243Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8308420Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8308463Z raise RuntimeError(error) 2025-12-04T12:52:45.8308544Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8308590Z Traceback (most recent call last): 2025-12-04T12:52:45.8308763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8308819Z getattr(self, test_name)() 2025-12-04T12:52:45.8308977Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8309011Z fn() 2025-12-04T12:52:45.8309163Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8309203Z method(*args, **kwargs) 2025-12-04T12:52:45.8309353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8309394Z method(*args, **kwargs) 2025-12-04T12:52:45.8309544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8309582Z with policy(): 2025-12-04T12:52:45.8309734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8309776Z raise RuntimeError(msg) 2025-12-04T12:52:45.8310159Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
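Each failing run also ends with the ProcessGroupNCCL warning quoted above (destroy_process_group() was not called before program exit). Independent of the leak itself, the shutdown that warning asks for looks roughly like the following; the init arguments are placeholders and this is not the test harness's actual setup/teardown, which lives in torch.testing._internal.common_distributed.

    import torch
    import torch.distributed as dist

    def run(rank: int, world_size: int) -> None:
        # Placeholder init: assumes MASTER_ADDR/MASTER_PORT are set in the environment.
        torch.cuda.set_device(rank)
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        try:
            ...  # collectives / FSDP forward-backward for this rank
        finally:
            # Explicit shutdown avoids the "destroy_process_group() was not called
            # before program exit" warning and releases communicator resources.
            dist.destroy_process_group()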
2025-12-04T12:52:45.8310163Z 2025-12-04T12:52:45.8310237Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8310506Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8310508Z 2025-12-04T12:52:45.8310596Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8310599Z 2025-12-04T12:52:45.8310601Z 2025-12-04T12:52:45.8310677Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8310764Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8310999Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-865ac1d538c948a6.xml - 2025-12-04T12:52:45.8311059Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8311351Z FAILED [9.0133s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8311398Z Traceback (most recent call last): 2025-12-04T12:52:45.8311560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8311604Z getattr(self, test_name)() 2025-12-04T12:52:45.8311764Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8311799Z fn() 2025-12-04T12:52:45.8311949Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8311989Z method(*args, **kwargs) 2025-12-04T12:52:45.8312148Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8312190Z method(*args, **kwargs) 2025-12-04T12:52:45.8312339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8312377Z with policy(): 2025-12-04T12:52:45.8312527Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8312588Z raise RuntimeError(msg) 2025-12-04T12:52:45.8312972Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8312975Z 2025-12-04T12:52:45.8313050Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8313318Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8313322Z 2025-12-04T12:52:45.8313409Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8313472Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
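The UserWarning repeated above for every rank ("FSDP got the argument `device_id` cuda ... which does not have an explicit index") points at the test passing the bare string "cuda" as device_id. The two fixes the warning itself suggests look like this in isolation; the model is a placeholder and this is not the code of test_fsdp_comm.py.

    import torch
    import torch.nn as nn
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    rank = dist.get_rank()

    # Fix 1: make the current device explicit before wrapping, so "cuda" resolves to it.
    torch.cuda.set_device(rank)
    model = FSDP(nn.Linear(8, 8).cuda())

    # Fix 2: pass a device_id that carries an explicit index instead of the bare "cuda".
    model = FSDP(nn.Linear(8, 8), device_id=torch.device("cuda", rank))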
2025-12-04T12:52:45.8313534Z ======================= 1 failed, 7 deselected in 9.02s ======================== 2025-12-04T12:52:45.8313572Z Got exit code 1 2025-12-04T12:52:45.8313611Z Retrying single test... 2025-12-04T12:52:45.8313798Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-681fc1044bc94934.xml 2025-12-04T12:52:45.8313855Z ============================= test session starts ============================== 2025-12-04T12:52:45.8313968Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8314007Z cachedir: .pytest_cache 2025-12-04T12:52:45.8314165Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8314210Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8314252Z configfile: pytest.ini 2025-12-04T12:52:45.8314413Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8314489Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8314751Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8314795Z Running 1 items in this shard 2025-12-04T12:52:45.8314797Z 2025-12-04T12:52:45.8315137Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda I1204 12:51:14.971000 504699 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 504768 2025-12-04T12:52:45.8315304Z I1204 12:51:14.973000 504699 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 504769 2025-12-04T12:52:45.8315456Z I1204 12:51:14.974000 504699 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 504770 2025-12-04T12:52:45.8315609Z I1204 12:51:14.974000 504699 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 504771 2025-12-04T12:52:45.8316118Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8316181Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8316668Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8316752Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8317236Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. 
FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8317294Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8317777Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8317836Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8317979Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8318143Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8318467Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8318621Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8318906Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8319031Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8319308Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8319456Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8319744Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8319893Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8320166Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8320319Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8320597Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8320746Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 
2025-12-04T12:52:45.8321267Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2466250752 and is now 3196059648. 2025-12-04T12:52:45.8321402Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8321597Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8321992Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8322110Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8322322Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8322488Z [rank0]:E1204 12:51:22.139000 504768 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8322526Z dist init r=0, world=4 2025-12-04T12:52:45.8322667Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8322828Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8323115Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8323270Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8323554Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8323679Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8323963Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8324110Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8324385Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8324541Z [rank3]:E1204 12:51:22.189000 504771 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8324817Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8324954Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8325232Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8325400Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8325910Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8326026Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8326220Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8326615Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8326730Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8326940Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8327103Z [rank3]:E1204 12:51:22.189000 504771 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8327144Z dist init r=3, world=4 2025-12-04T12:52:45.8327283Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8327445Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8327732Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8327885Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8328363Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in 
wrapper 2025-12-04T12:52:45.8328489Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8328766Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8328913Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8329201Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8329348Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8329622Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8329788Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8330065Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8330213Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8330723Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8330838Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8331033Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8331425Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8331539Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8331748Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8331913Z [rank1]:E1204 12:51:22.241000 504769 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8331951Z dist init r=1, world=4 2025-12-04T12:52:45.8332091Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8332250Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8332545Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8332700Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8332984Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8333107Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8333390Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8333538Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8333814Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8333979Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8334254Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8334390Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8334670Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8334817Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8335327Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 2025-12-04T12:52:45.8335441Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8335636Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8336029Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8336144Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8336353Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8336517Z [rank2]:E1204 12:51:22.244000 504770 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8336556Z dist init r=2, world=4 2025-12-04T12:52:45.8336902Z [rank0]:[W1204 12:51:22.006850826 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8336943Z FAILED [9.0130s] [100%] 2025-12-04T12:52:45.8336946Z 2025-12-04T12:52:45.8337002Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8337135Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8337183Z Traceback (most recent call last): 2025-12-04T12:52:45.8337354Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8337399Z self._join_processes(fn) 2025-12-04T12:52:45.8337572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8337627Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8337804Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8337863Z raise RuntimeError(error) 2025-12-04T12:52:45.8337954Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8338001Z Traceback (most recent call last): 2025-12-04T12:52:45.8338211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8338254Z getattr(self, test_name)() 2025-12-04T12:52:45.8338413Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8338447Z fn() 2025-12-04T12:52:45.8338598Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8338639Z method(*args, **kwargs) 2025-12-04T12:52:45.8338788Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8338829Z method(*args, **kwargs) 2025-12-04T12:52:45.8338978Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8339016Z with policy(): 2025-12-04T12:52:45.8339169Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8339210Z raise RuntimeError(msg) 2025-12-04T12:52:45.8339594Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2466250752 and is now 3196059648. 
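All of the consistently failing parametrisations here carry use_no_sync_True, i.e. they exercise FSDP's no_sync() gradient-accumulation path. For orientation, the pattern under test looks roughly like this; it is a sketch of the public API, not the body of test_fsdp_comm.py, and the model and batches are placeholders.

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def accumulate_then_sync(model: FSDP, batches: list[torch.Tensor]) -> None:
        # Inside no_sync() gradients stay local: FSDP skips the gradient
        # communication it would normally issue during backward.
        with model.no_sync():
            for x in batches[:-1]:
                model(x).sum().backward()
        # The first backward outside the context performs the deferred communication.
        model(batches[-1]).sum().backward()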
2025-12-04T12:52:45.8339598Z 2025-12-04T12:52:45.8339675Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8339944Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8339947Z 2025-12-04T12:52:45.8340036Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8340038Z 2025-12-04T12:52:45.8340040Z 2025-12-04T12:52:45.8340115Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8340202Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8340435Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-681fc1044bc94934.xml - 2025-12-04T12:52:45.8340494Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8340788Z FAILED [9.0130s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8340836Z Traceback (most recent call last): 2025-12-04T12:52:45.8341001Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8341043Z getattr(self, test_name)() 2025-12-04T12:52:45.8341201Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8341247Z fn() 2025-12-04T12:52:45.8341399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8341440Z method(*args, **kwargs) 2025-12-04T12:52:45.8341592Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8341633Z method(*args, **kwargs) 2025-12-04T12:52:45.8341782Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8341845Z with policy(): 2025-12-04T12:52:45.8341995Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8342035Z raise RuntimeError(msg) 2025-12-04T12:52:45.8342423Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2466250752 and is now 3196059648. 2025-12-04T12:52:45.8342425Z 2025-12-04T12:52:45.8342500Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8342766Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8342770Z 2025-12-04T12:52:45.8342857Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8342922Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
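Each session header above also reports hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]. A profile with exactly those settings would be registered and loaded roughly as follows; this is the generic hypothesis API, not a quote of PyTorch's conftest.

    from hypothesis import HealthCheck, settings

    settings.register_profile(
        "pytorch_ci",
        database=None,               # no example database on CI
        max_examples=50,
        derandomize=True,            # deterministic example generation
        suppress_health_check=[HealthCheck.too_slow],
    )
    settings.load_profile("pytorch_ci")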
2025-12-04T12:52:45.8342983Z ======================= 1 failed, 9 deselected in 9.02s ======================== 2025-12-04T12:52:45.8343020Z Got exit code 1 2025-12-04T12:52:45.8343059Z Retrying single test... 2025-12-04T12:52:45.8343248Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-9dd64414167d3313.xml 2025-12-04T12:52:45.8343306Z ============================= test session starts ============================== 2025-12-04T12:52:45.8343419Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8343459Z cachedir: .pytest_cache 2025-12-04T12:52:45.8343618Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8343665Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8343707Z configfile: pytest.ini 2025-12-04T12:52:45.8343869Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8343943Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8344205Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8344249Z Running 1 items in this shard 2025-12-04T12:52:45.8344252Z 2025-12-04T12:52:45.8344598Z distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda I1204 12:51:26.607000 505101 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 505170 2025-12-04T12:52:45.8344756Z I1204 12:51:26.608000 505101 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 505171 2025-12-04T12:52:45.8344910Z I1204 12:51:26.609000 505101 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 505172 2025-12-04T12:52:45.8345060Z I1204 12:51:26.609000 505101 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 505173 2025-12-04T12:52:45.8345566Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8345628Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8346128Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8346198Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8346682Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. 
FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8346740Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8347221Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8347279Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8347423Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8347585Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8347876Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8348031Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8348365Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8348491Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8348783Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8348931Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8349207Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8349356Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8349641Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8349779Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8350056Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8352932Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 
2025-12-04T12:52:45.8353463Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8353582Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8353782Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8354185Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8354303Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8354515Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8354681Z [rank3]:E1204 12:51:33.549000 505173 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T12:52:45.8354722Z dist init r=3, world=4 2025-12-04T12:52:45.8354864Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8355023Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8355313Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8355468Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8355781Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8355907Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8356184Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8356333Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8356619Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8356766Z [rank1]:E1204 12:51:33.573000 505171 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8357041Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8357188Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8357485Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8357633Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8358198Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 2025-12-04T12:52:45.8358315Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8358511Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8358907Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8359020Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8359231Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8359394Z [rank1]:E1204 12:51:33.573000 505171 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8359435Z dist init r=1, world=4 2025-12-04T12:52:45.8359572Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8359732Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8360018Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8360187Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8360474Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in 
wrapper 2025-12-04T12:52:45.8360600Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8360888Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8361035Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8361311Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8361469Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8361757Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8361894Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8362168Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8362317Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8362827Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 2. CUDA driver allocated memory was 2300575744 and is now 3036676096. 
2025-12-04T12:52:45.8362944Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8363139Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8363534Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8363650Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8363860Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8364024Z [rank2]:E1204 12:51:33.596000 505172 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T12:52:45.8364062Z dist init r=2, world=4 2025-12-04T12:52:45.8364199Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8364366Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8364651Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8364806Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8365099Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8365224Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8365500Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8365647Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8365941Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8366088Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8366363Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8366499Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] 
with policy(): 2025-12-04T12:52:45.8366774Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8366923Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8367434Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 2025-12-04T12:52:45.8367549Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8367744Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8368139Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8368294Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8368503Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8368680Z [rank0]:E1204 12:51:33.687000 505170 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8368719Z dist init r=0, world=4 2025-12-04T12:52:45.8369056Z [rank0]:[W1204 12:51:33.583180533 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8369097Z FAILED [9.0136s] [100%] 2025-12-04T12:52:45.8369100Z 2025-12-04T12:52:45.8369157Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8369306Z _ TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda _ 2025-12-04T12:52:45.8369354Z Traceback (most recent call last): 2025-12-04T12:52:45.8369517Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8369562Z self._join_processes(fn) 2025-12-04T12:52:45.8369734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8369803Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8369999Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8370043Z raise RuntimeError(error) 2025-12-04T12:52:45.8370125Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8370170Z Traceback (most recent call last): 2025-12-04T12:52:45.8370331Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8370374Z getattr(self, test_name)() 2025-12-04T12:52:45.8370532Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8370568Z fn() 2025-12-04T12:52:45.8370719Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8370762Z method(*args, **kwargs) 2025-12-04T12:52:45.8370913Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8370954Z method(*args, **kwargs) 2025-12-04T12:52:45.8371103Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8371140Z with policy(): 2025-12-04T12:52:45.8371292Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8371333Z raise RuntimeError(msg) 2025-12-04T12:52:45.8371721Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8371726Z 2025-12-04T12:52:45.8371801Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8372071Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8372073Z 2025-12-04T12:52:45.8372162Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8372165Z 2025-12-04T12:52:45.8372226Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8372270Z Traceback (most recent call last): 2025-12-04T12:52:45.8372433Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8372484Z getattr(self, test_name)() 2025-12-04T12:52:45.8372644Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8372679Z fn() 2025-12-04T12:52:45.8372832Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8372871Z method(*args, **kwargs) 2025-12-04T12:52:45.8373021Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8373060Z method(*args, **kwargs) 2025-12-04T12:52:45.8373220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8373257Z with policy(): 2025-12-04T12:52:45.8373409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8373451Z raise RuntimeError(msg) 2025-12-04T12:52:45.8373834Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8373866Z 2025-12-04T12:52:45.8373940Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8374208Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8374209Z 2025-12-04T12:52:45.8374298Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8374300Z 2025-12-04T12:52:45.8374358Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8374405Z Traceback (most recent call last): 2025-12-04T12:52:45.8374567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8374610Z getattr(self, test_name)() 2025-12-04T12:52:45.8374769Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8374804Z fn() 2025-12-04T12:52:45.8374954Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8374993Z method(*args, **kwargs) 2025-12-04T12:52:45.8375145Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8375184Z method(*args, **kwargs) 2025-12-04T12:52:45.8375333Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8375370Z with policy(): 2025-12-04T12:52:45.8375521Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8375563Z raise RuntimeError(msg) 2025-12-04T12:52:45.8375946Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8375949Z 2025-12-04T12:52:45.8376021Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8376287Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8376289Z 2025-12-04T12:52:45.8376384Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8376387Z 2025-12-04T12:52:45.8376389Z 2025-12-04T12:52:45.8376467Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8376555Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8376789Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-9dd64414167d3313.xml - 2025-12-04T12:52:45.8376850Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8377144Z FAILED [9.0136s] distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8377191Z Traceback (most recent call last): 2025-12-04T12:52:45.8377354Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8377396Z getattr(self, test_name)() 2025-12-04T12:52:45.8377565Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8377609Z fn() 2025-12-04T12:52:45.8377759Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8377799Z method(*args, **kwargs) 2025-12-04T12:52:45.8377948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8377988Z method(*args, **kwargs) 2025-12-04T12:52:45.8378136Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8378216Z with policy(): 2025-12-04T12:52:45.8378366Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8378407Z raise RuntimeError(msg) 2025-12-04T12:52:45.8378790Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 0. CUDA driver allocated memory was 2459959296 and is now 3196059648. 
2025-12-04T12:52:45.8378794Z 2025-12-04T12:52:45.8378867Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8379137Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8379139Z 2025-12-04T12:52:45.8379226Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8379228Z 2025-12-04T12:52:45.8379287Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8379331Z Traceback (most recent call last): 2025-12-04T12:52:45.8379495Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8379537Z getattr(self, test_name)() 2025-12-04T12:52:45.8379696Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8379729Z fn() 2025-12-04T12:52:45.8379881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8379921Z method(*args, **kwargs) 2025-12-04T12:52:45.8380070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8380126Z method(*args, **kwargs) 2025-12-04T12:52:45.8380275Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8380313Z with policy(): 2025-12-04T12:52:45.8380464Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8380506Z raise RuntimeError(msg) 2025-12-04T12:52:45.8380902Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 1. CUDA driver allocated memory was 2317352960 and is now 3053453312. 
2025-12-04T12:52:45.8380904Z 2025-12-04T12:52:45.8380977Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8381243Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8381244Z 2025-12-04T12:52:45.8381331Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8381347Z 2025-12-04T12:52:45.8381417Z Process 3 exited with error code 10 and exception: 2025-12-04T12:52:45.8381462Z Traceback (most recent call last): 2025-12-04T12:52:45.8381624Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8381666Z getattr(self, test_name)() 2025-12-04T12:52:45.8381825Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8381858Z fn() 2025-12-04T12:52:45.8382007Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8382046Z method(*args, **kwargs) 2025-12-04T12:52:45.8382195Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8382234Z method(*args, **kwargs) 2025-12-04T12:52:45.8382383Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8382419Z with policy(): 2025-12-04T12:52:45.8382569Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8382609Z raise RuntimeError(msg) 2025-12-04T12:52:45.8382991Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda! Caching allocator allocated memory was 512 and is now reported as 4608 on device 3. CUDA driver allocated memory was 2250244096 and is now 2986344448. 2025-12-04T12:52:45.8382993Z 2025-12-04T12:52:45.8383065Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8383330Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestCommunicationCUDA.test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8383333Z 2025-12-04T12:52:45.8383422Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8383485Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.8383548Z ======================= 1 failed, 9 deselected in 9.02s ======================== 2025-12-04T12:52:45.8383585Z Got exit code 1 2025-12-04T12:52:45.8383806Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda 2025-12-04T12:52:45.8383944Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8384135Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-4c47fa7ebdbb029a.xml 2025-12-04T12:52:45.8384194Z ============================= test session starts ============================== 2025-12-04T12:52:45.8384308Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8384349Z cachedir: .pytest_cache 2025-12-04T12:52:45.8384506Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8384553Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8384603Z configfile: pytest.ini 2025-12-04T12:52:45.8384765Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8384839Z collecting ... collected 10 items / 8 deselected / 2 selected 2025-12-04T12:52:45.8384892Z stepcurrent: skipping 8 already run items. 2025-12-04T12:52:45.8384936Z Running 2 items in this shard 2025-12-04T12:52:45.8384938Z 2025-12-04T12:52:45.8385253Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda I1204 12:51:37.919000 505503 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 505572 2025-12-04T12:52:45.8385418Z I1204 12:51:37.920000 505503 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 505573 2025-12-04T12:52:45.8385917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8385980Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8386469Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8386531Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8387612Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. 
If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8387739Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8388861Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8388986Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8389129Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8389314Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8389605Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8389759Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8391710Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8391851Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8392129Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8392278Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8392555Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8392704Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T12:52:45.8392979Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8393116Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8393394Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8393541Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8394020Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8394136Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8394332Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8394699Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8394816Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8395030Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8395203Z [rank1]:E1204 12:51:44.699000 505573 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8395244Z dist init r=1, world=2 2025-12-04T12:52:45.8395382Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8395542Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8395827Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8396003Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8396287Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8396412Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 
2025-12-04T12:52:45.8396687Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8396833Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8397109Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8397257Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8397533Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8397669Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8397945Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8398095Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8398621Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8398757Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8398952Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8399309Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8399423Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8399647Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8399814Z [rank0]:E1204 12:51:44.702000 505572 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8399853Z dist init r=0, world=2 2025-12-04T12:52:45.8400188Z [rank0]:[W1204 12:51:44.562634456 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8400251Z FAILED [8.6135s] [ 50%] 2025-12-04T12:52:45.8400254Z 2025-12-04T12:52:45.8400309Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8400408Z ____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda _____ 2025-12-04T12:52:45.8400455Z Traceback (most recent call last): 2025-12-04T12:52:45.8400618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8400661Z self._join_processes(fn) 2025-12-04T12:52:45.8400833Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8400888Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8401064Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8401109Z raise RuntimeError(error) 2025-12-04T12:52:45.8401189Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8401234Z Traceback (most recent call last): 2025-12-04T12:52:45.8401395Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8401436Z getattr(self, test_name)() 2025-12-04T12:52:45.8401593Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8401627Z fn() 2025-12-04T12:52:45.8401780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8401821Z method(*args, **kwargs) 2025-12-04T12:52:45.8401973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8402013Z method(*args, **kwargs) 2025-12-04T12:52:45.8402164Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8402200Z with policy(): 2025-12-04T12:52:45.8402352Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8402392Z raise RuntimeError(msg) 2025-12-04T12:52:45.8402751Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
2025-12-04T12:52:45.8402754Z 2025-12-04T12:52:45.8402828Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8403059Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8403062Z 2025-12-04T12:52:45.8403150Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8403152Z 2025-12-04T12:52:45.8403210Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8403267Z Traceback (most recent call last): 2025-12-04T12:52:45.8403429Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8403472Z getattr(self, test_name)() 2025-12-04T12:52:45.8403630Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8403665Z fn() 2025-12-04T12:52:45.8403814Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8403879Z method(*args, **kwargs) 2025-12-04T12:52:45.8404028Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8404068Z method(*args, **kwargs) 2025-12-04T12:52:45.8404218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8404256Z with policy(): 2025-12-04T12:52:45.8404406Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8404447Z raise RuntimeError(msg) 2025-12-04T12:52:45.8404792Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8404796Z 2025-12-04T12:52:45.8404869Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8405098Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8405100Z 2025-12-04T12:52:45.8405187Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8405189Z 2025-12-04T12:52:45.8405191Z 2025-12-04T12:52:45.8405266Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8405352Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8405587Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-4c47fa7ebdbb029a.xml - 2025-12-04T12:52:45.8405647Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8405892Z FAILED [8.6135s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8405939Z Traceback (most recent call last): 2025-12-04T12:52:45.8406102Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8406144Z getattr(self, test_name)() 2025-12-04T12:52:45.8406303Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8406336Z fn() 2025-12-04T12:52:45.8406506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8406547Z method(*args, **kwargs) 2025-12-04T12:52:45.8406698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8406738Z method(*args, **kwargs) 2025-12-04T12:52:45.8406886Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8406926Z with policy(): 2025-12-04T12:52:45.8407086Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8407128Z raise RuntimeError(msg) 2025-12-04T12:52:45.8407476Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
2025-12-04T12:52:45.8407478Z 2025-12-04T12:52:45.8407551Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8407789Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8407802Z 2025-12-04T12:52:45.8407890Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8407893Z 2025-12-04T12:52:45.8407950Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8407999Z Traceback (most recent call last): 2025-12-04T12:52:45.8408187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8408231Z getattr(self, test_name)() 2025-12-04T12:52:45.8408391Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8408425Z fn() 2025-12-04T12:52:45.8408575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8408616Z method(*args, **kwargs) 2025-12-04T12:52:45.8408765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8408803Z method(*args, **kwargs) 2025-12-04T12:52:45.8408952Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8408989Z with policy(): 2025-12-04T12:52:45.8409140Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8409180Z raise RuntimeError(msg) 2025-12-04T12:52:45.8409526Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8409530Z 2025-12-04T12:52:45.8409602Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8409829Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8409832Z 2025-12-04T12:52:45.8409917Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8409982Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T12:52:45.8410044Z ======================= 1 failed, 8 deselected in 8.62s ======================== 2025-12-04T12:52:45.8410081Z Got exit code 1 2025-12-04T12:52:45.8410121Z Retrying single test... 
2025-12-04T12:52:45.8410326Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a73d7424feda7a29.xml 2025-12-04T12:52:45.8410386Z ============================= test session starts ============================== 2025-12-04T12:52:45.8410499Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8410540Z cachedir: .pytest_cache 2025-12-04T12:52:45.8410696Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8410742Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8410797Z configfile: pytest.ini 2025-12-04T12:52:45.8410961Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8411033Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8411254Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8411313Z Running 1 items in this shard 2025-12-04T12:52:45.8411315Z 2025-12-04T12:52:45.8411631Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda I1204 12:51:48.849000 505739 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 505808 2025-12-04T12:52:45.8411784Z I1204 12:51:48.850000 505739 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 505809 2025-12-04T12:52:45.8412281Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8412343Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8412829Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8412890Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8413969Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. 
(Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8414094Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8415160Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8415284Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8415436Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8415598Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8415889Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8416056Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8416353Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8416479Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8416756Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8416904Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8417179Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8417327Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8417602Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8417737Z [rank0]:E1204 12:51:55.607000 505808 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8418014Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8418220Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8418700Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8418814Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8419035Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8419392Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8419508Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8419719Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8419894Z [rank0]:E1204 12:51:55.607000 505808 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8419934Z dist init r=0, world=2 2025-12-04T12:52:45.8420072Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8420230Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8420530Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8420707Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8420992Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8421117Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8421392Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8421540Z [rank1]:E1204 12:51:55.610000 505809 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8421814Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8421962Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8422235Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8422371Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8422646Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8422795Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8423285Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8423399Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8423595Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8423951Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8424078Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8424289Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8424454Z [rank1]:E1204 12:51:55.610000 505809 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8424493Z dist init r=1, world=2 2025-12-04T12:52:45.8424840Z [rank0]:[W1204 12:51:55.448252263 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8424892Z FAILED [8.5120s] [100%] 2025-12-04T12:52:45.8424894Z 2025-12-04T12:52:45.8424949Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8425050Z ____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda _____ 2025-12-04T12:52:45.8425096Z Traceback (most recent call last): 2025-12-04T12:52:45.8425260Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8425303Z self._join_processes(fn) 2025-12-04T12:52:45.8425477Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8425530Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8425709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8425752Z raise RuntimeError(error) 2025-12-04T12:52:45.8425832Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8425877Z Traceback (most recent call last): 2025-12-04T12:52:45.8426039Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8426080Z getattr(self, test_name)() 2025-12-04T12:52:45.8426239Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8426274Z fn() 2025-12-04T12:52:45.8426424Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8426465Z method(*args, **kwargs) 2025-12-04T12:52:45.8426615Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8426656Z method(*args, **kwargs) 2025-12-04T12:52:45.8426806Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8426843Z with policy(): 2025-12-04T12:52:45.8426994Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8427035Z raise RuntimeError(msg) 2025-12-04T12:52:45.8427396Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
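Both ranks also emit the torch/distributed/fsdp/_init_utils.py UserWarning shown above because the test passes a bare "cuda" device as device_id. A minimal sketch of the remedy the warning itself suggests (an explicit device index, or torch.cuda.set_device before FSDP construction); the model and rank handling here are placeholders and assume one process per GPU on a single node:

    # Sketch of the fix suggested by the FSDP UserWarning above (placeholders, not the test code).
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap(model: torch.nn.Module) -> FSDP:
        rank = dist.get_rank()                      # assumes single node, one process per GPU
        torch.cuda.set_device(rank)                 # pin the current device for this rank
        return FSDP(model, device_id=torch.device("cuda", rank))  # explicit index, no warning

Either half of this (set_device or an indexed device_id) is enough to silence the warning; the warning itself is separate from the leak-check failure that actually fails the test.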
2025-12-04T12:52:45.8427399Z 2025-12-04T12:52:45.8427474Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8427704Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8427706Z 2025-12-04T12:52:45.8427794Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8427796Z 2025-12-04T12:52:45.8427808Z 2025-12-04T12:52:45.8427884Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8427970Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8428238Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-a73d7424feda7a29.xml - 2025-12-04T12:52:45.8428298Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8428558Z FAILED [8.5120s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8428619Z Traceback (most recent call last): 2025-12-04T12:52:45.8428782Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8428824Z getattr(self, test_name)() 2025-12-04T12:52:45.8428983Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8429017Z fn() 2025-12-04T12:52:45.8429169Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8429208Z method(*args, **kwargs) 2025-12-04T12:52:45.8429360Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8429400Z method(*args, **kwargs) 2025-12-04T12:52:45.8429550Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8429586Z with policy(): 2025-12-04T12:52:45.8429738Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8429778Z raise RuntimeError(msg) 2025-12-04T12:52:45.8430130Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8430133Z 2025-12-04T12:52:45.8430206Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8430433Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8430436Z 2025-12-04T12:52:45.8430523Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8430585Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.8430647Z ======================= 1 failed, 9 deselected in 8.52s ======================== 2025-12-04T12:52:45.8430685Z Got exit code 1 2025-12-04T12:52:45.8430725Z Retrying single test... 2025-12-04T12:52:45.8430913Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-8df6fab4d75749e9.xml 2025-12-04T12:52:45.8430984Z ============================= test session starts ============================== 2025-12-04T12:52:45.8431095Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8431137Z cachedir: .pytest_cache 2025-12-04T12:52:45.8431294Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8431341Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8431380Z configfile: pytest.ini 2025-12-04T12:52:45.8431542Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8431629Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8431850Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8431894Z Running 1 items in this shard 2025-12-04T12:52:45.8431897Z 2025-12-04T12:52:45.8432200Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda I1204 12:51:59.754000 505975 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 506044 2025-12-04T12:52:45.8432377Z I1204 12:51:59.755000 505975 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 506045 2025-12-04T12:52:45.8432871Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8432933Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8433419Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8433483Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8434557Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). 
If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8434682Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8435750Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8435874Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8436017Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8436182Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8436484Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8436638Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8436925Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8437078Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8437357Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8437504Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8437780Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8437927Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8438238Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8438374Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8438652Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8438801Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8439273Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8439390Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8439586Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8439964Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8440078Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8440290Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8440454Z [rank0]:E1204 12:52:06.877000 506044 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8440493Z dist init r=0, world=2 2025-12-04T12:52:45.8440645Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8440804Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8441093Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8441273Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8441555Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8441682Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8441957Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8442104Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8442380Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8442527Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8442802Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8442937Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8443216Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8443366Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8443840Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8443953Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8444162Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8444517Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8444631Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8444851Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8445015Z [rank1]:E1204 12:52:06.895000 506045 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8445054Z dist init r=1, world=2 2025-12-04T12:52:45.8445391Z [rank0]:[W1204 12:52:07.971638389 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8445449Z FAILED [9.4118s] [100%] 2025-12-04T12:52:45.8445451Z 2025-12-04T12:52:45.8445507Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8445607Z ____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda _____ 2025-12-04T12:52:45.8445653Z Traceback (most recent call last): 2025-12-04T12:52:45.8445815Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8445859Z self._join_processes(fn) 2025-12-04T12:52:45.8446031Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8446087Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8446263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8446308Z raise RuntimeError(error) 2025-12-04T12:52:45.8446387Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8446432Z Traceback (most recent call last): 2025-12-04T12:52:45.8446591Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8446634Z getattr(self, test_name)() 2025-12-04T12:52:45.8446790Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8446825Z fn() 2025-12-04T12:52:45.8446975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8447016Z method(*args, **kwargs) 2025-12-04T12:52:45.8447165Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8447206Z method(*args, **kwargs) 2025-12-04T12:52:45.8447356Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8447393Z with policy(): 2025-12-04T12:52:45.8447544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8447584Z raise RuntimeError(msg) 2025-12-04T12:52:45.8447932Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
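The c10d::allreduce_ UserWarning repeated above suggests registering a fallthrough for the Autograd dispatch key (torch::CppFunction::makeFallthrough() in C++). A rough Python-side analogue for a custom operator uses torch.library's fallthrough kernel; the operator and namespace below are hypothetical, and whether this mechanism is the right fix for c10d::allreduce_ itself is an assumption — it is shown only to illustrate what the warning is asking for:

    # Hypothetical example: define a non-differentiable in-place op and register a
    # fallthrough for the Autograd key so backprop through it does not warn.
    import torch

    lib = torch.library.Library("mylib", "FRAGMENT")                 # hypothetical namespace
    lib.define("noop_(Tensor(a!) x) -> Tensor(a!)")
    lib.impl("noop_", lambda x: x, "CompositeExplicitAutograd")      # actual kernel
    lib.impl("noop_", torch.library.fallthrough_kernel, "Autograd")  # squash the warning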
2025-12-04T12:52:45.8447946Z 2025-12-04T12:52:45.8448021Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8448294Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8448298Z 2025-12-04T12:52:45.8448386Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8448388Z 2025-12-04T12:52:45.8448390Z 2025-12-04T12:52:45.8448464Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8448565Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8448799Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-8df6fab4d75749e9.xml - 2025-12-04T12:52:45.8448859Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8449104Z FAILED [9.4118s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8449163Z Traceback (most recent call last): 2025-12-04T12:52:45.8449339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8449381Z getattr(self, test_name)() 2025-12-04T12:52:45.8449541Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8449576Z fn() 2025-12-04T12:52:45.8449727Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8449767Z method(*args, **kwargs) 2025-12-04T12:52:45.8449918Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8449958Z method(*args, **kwargs) 2025-12-04T12:52:45.8450107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8450145Z with policy(): 2025-12-04T12:52:45.8450295Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8450336Z raise RuntimeError(msg) 2025-12-04T12:52:45.8450683Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda! Caching allocator allocated memory was 512 and is now reported as 13824 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8450685Z 2025-12-04T12:52:45.8450759Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8450987Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8450990Z 2025-12-04T12:52:45.8451077Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8451141Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T12:52:45.8451201Z ======================= 1 failed, 9 deselected in 9.42s ======================== 2025-12-04T12:52:45.8451238Z Got exit code 1 2025-12-04T12:52:45.8451414Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda 2025-12-04T12:52:45.8451542Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8451730Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-7ec29cbcb4eb0a53.xml 2025-12-04T12:52:45.8451801Z ============================= test session starts ============================== 2025-12-04T12:52:45.8451912Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8451955Z cachedir: .pytest_cache 2025-12-04T12:52:45.8452112Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8452157Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8452197Z configfile: pytest.ini 2025-12-04T12:52:45.8452376Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8452449Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8452502Z stepcurrent: skipping 9 already run items. 2025-12-04T12:52:45.8452545Z Running 1 items in this shard 2025-12-04T12:52:45.8452547Z 2025-12-04T12:52:45.8452852Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda I1204 12:52:11.471000 506211 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 506280 2025-12-04T12:52:45.8453019Z I1204 12:52:11.472000 506211 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 506281 2025-12-04T12:52:45.8453525Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8453587Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8454075Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8454137Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8455207Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. 
DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8455331Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8456403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8456528Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8456672Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8456834Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8457133Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8457288Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8457573Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8457721Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8457997Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8458197Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8458473Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8458619Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8458899Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8459036Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8459313Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8459462Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8459933Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 2025-12-04T12:52:45.8460049Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8460245Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8460620Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8460734Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8460946Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8461112Z [rank0]:E1204 12:52:18.315000 506280 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8461151Z dist init r=0, world=2 2025-12-04T12:52:45.8461301Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8461459Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8461746Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8461899Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8462207Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8462331Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8462608Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8462755Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8463029Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8463177Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8463454Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8463590Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8463867Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8464014Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8464485Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8464598Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8464804Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8465156Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8465271Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8465481Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8465655Z [rank1]:E1204 12:52:18.380000 506281 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8465694Z dist init r=1, world=2 2025-12-04T12:52:45.8466033Z [rank0]:[W1204 12:52:18.165898650 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8466083Z FAILED [8.6119s] [100%] 2025-12-04T12:52:45.8466085Z 2025-12-04T12:52:45.8466149Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8466248Z _____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda _____ 2025-12-04T12:52:45.8466295Z Traceback (most recent call last): 2025-12-04T12:52:45.8466456Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8466502Z self._join_processes(fn) 2025-12-04T12:52:45.8466675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8466727Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8466905Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8466948Z raise RuntimeError(error) 2025-12-04T12:52:45.8467028Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8467074Z Traceback (most recent call last): 2025-12-04T12:52:45.8467236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8467280Z getattr(self, test_name)() 2025-12-04T12:52:45.8467437Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8467472Z fn() 2025-12-04T12:52:45.8467623Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8467663Z method(*args, **kwargs) 2025-12-04T12:52:45.8467814Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8467854Z method(*args, **kwargs) 2025-12-04T12:52:45.8468003Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8468041Z with policy(): 2025-12-04T12:52:45.8468235Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8468276Z raise RuntimeError(msg) 2025-12-04T12:52:45.8468620Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 
2025-12-04T12:52:45.8468623Z 2025-12-04T12:52:45.8468726Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8468953Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8468958Z 2025-12-04T12:52:45.8469046Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8469049Z 2025-12-04T12:52:45.8469050Z 2025-12-04T12:52:45.8469125Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8469211Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8469466Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-7ec29cbcb4eb0a53.xml - 2025-12-04T12:52:45.8469526Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8469768Z FAILED [8.6119s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8469813Z Traceback (most recent call last): 2025-12-04T12:52:45.8469992Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8470046Z getattr(self, test_name)() 2025-12-04T12:52:45.8470205Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8470239Z fn() 2025-12-04T12:52:45.8470392Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8470431Z method(*args, **kwargs) 2025-12-04T12:52:45.8470583Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8470622Z method(*args, **kwargs) 2025-12-04T12:52:45.8470772Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8470809Z with policy(): 2025-12-04T12:52:45.8470960Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8471002Z raise RuntimeError(msg) 2025-12-04T12:52:45.8471349Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 2025-12-04T12:52:45.8471351Z 2025-12-04T12:52:45.8471425Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8471652Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8471654Z 2025-12-04T12:52:45.8471741Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8471804Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
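[editor's note] The failure above is raised by the memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it snapshots the caching-allocator bytes and the driver-level memory usage before the test and compares them afterwards (here the allocator goes from 512 B to 9216 B and the driver-allocated memory from roughly 2.0 GB to 3.5 GB on device 0). The sketch below is only a rough approximation of that before/after accounting using public torch.cuda APIs; it is not the harness code from common_utils.py, and the function and variable names are placeholders.

    # Hypothetical sketch only -- approximates the kind of before/after
    # accounting the CUDA mem-leak check performs around a test body.
    import torch

    def assert_no_cuda_leak(test_fn, device: int = 0) -> None:
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)    # caching allocator bytes
        free_before, total = torch.cuda.mem_get_info(device)  # driver-level view
        test_fn()
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before or free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: allocator "
                f"{alloc_before} -> {alloc_after} bytes, driver-used "
                f"{total - free_before} -> {total - free_after} bytes"
            )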
2025-12-04T12:52:45.8471867Z ======================= 1 failed, 9 deselected in 8.62s ======================== 2025-12-04T12:52:45.8471903Z Got exit code 1 2025-12-04T12:52:45.8471944Z Retrying single test... 2025-12-04T12:52:45.8472131Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-e214515fa2a46151.xml 2025-12-04T12:52:45.8472190Z ============================= test session starts ============================== 2025-12-04T12:52:45.8472300Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8472341Z cachedir: .pytest_cache 2025-12-04T12:52:45.8472514Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8472565Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8472606Z configfile: pytest.ini 2025-12-04T12:52:45.8472768Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8472841Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8473059Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8473103Z Running 1 items in this shard 2025-12-04T12:52:45.8473114Z 2025-12-04T12:52:45.8473414Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda I1204 12:52:22.769000 506447 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 506516 2025-12-04T12:52:45.8473569Z I1204 12:52:22.770000 506447 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 506517 2025-12-04T12:52:45.8474062Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8474146Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8474635Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8474694Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8475769Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). 
If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8475896Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8476958Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8477092Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8477236Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8477398Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8477690Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8477853Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8478140Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8478293Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8478585Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8478745Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8479021Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8479168Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8479441Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8479579Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8479859Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8480009Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8480484Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 2025-12-04T12:52:45.8480598Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8480794Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8481147Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8481260Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8481483Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8481648Z [rank0]:E1204 12:52:29.808000 506516 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8481688Z dist init r=0, world=2 2025-12-04T12:52:45.8481826Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8481986Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8482286Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8482440Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8482723Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8482872Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8483147Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8483294Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8483568Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8483715Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8483990Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8484124Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8484401Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8484550Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8485021Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8485137Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8485331Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8485696Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8485809Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8486022Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8486185Z [rank1]:E1204 12:52:29.890000 506517 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8486224Z dist init r=1, world=2 2025-12-04T12:52:45.8486567Z [rank0]:[W1204 12:52:29.648773767 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8486608Z FAILED [8.8113s] [100%] 2025-12-04T12:52:45.8486611Z 2025-12-04T12:52:45.8486667Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8486765Z _____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda _____ 2025-12-04T12:52:45.8486830Z Traceback (most recent call last): 2025-12-04T12:52:45.8486992Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8487035Z self._join_processes(fn) 2025-12-04T12:52:45.8487207Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8487261Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8487437Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8487482Z raise RuntimeError(error) 2025-12-04T12:52:45.8487561Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8487606Z Traceback (most recent call last): 2025-12-04T12:52:45.8487766Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8487811Z getattr(self, test_name)() 2025-12-04T12:52:45.8487967Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8488002Z fn() 2025-12-04T12:52:45.8488195Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8488236Z method(*args, **kwargs) 2025-12-04T12:52:45.8488387Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8488428Z method(*args, **kwargs) 2025-12-04T12:52:45.8488578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8488615Z with policy(): 2025-12-04T12:52:45.8488766Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8488808Z raise RuntimeError(msg) 2025-12-04T12:52:45.8489155Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 
2025-12-04T12:52:45.8489158Z 2025-12-04T12:52:45.8489233Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8489460Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8489487Z 2025-12-04T12:52:45.8489574Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8489577Z 2025-12-04T12:52:45.8489579Z 2025-12-04T12:52:45.8489654Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8489742Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T12:52:45.8489973Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-e214515fa2a46151.xml - 2025-12-04T12:52:45.8490032Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8490288Z FAILED [8.8113s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8490335Z Traceback (most recent call last): 2025-12-04T12:52:45.8490498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8490541Z getattr(self, test_name)() 2025-12-04T12:52:45.8490711Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8490763Z fn() 2025-12-04T12:52:45.8490914Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8490956Z method(*args, **kwargs) 2025-12-04T12:52:45.8491106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8491146Z method(*args, **kwargs) 2025-12-04T12:52:45.8491297Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8491336Z with policy(): 2025-12-04T12:52:45.8491489Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8491530Z raise RuntimeError(msg) 2025-12-04T12:52:45.8491877Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2021654528 and is now 3489660928. 2025-12-04T12:52:45.8491880Z 2025-12-04T12:52:45.8491956Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8492183Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8492187Z 2025-12-04T12:52:45.8492272Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8492336Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
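[editor's note] Each retry also prints the FSDP UserWarning about `device_id` being the bare, index-less "cuda" device. As the warning itself suggests, the usual fix is either to call torch.cuda.set_device() with the local rank before constructing FSDP or to pass a device with an explicit index. A minimal sketch follows; it is not taken from test_fsdp_comm.py, and `module` and the one-process-per-GPU rank assumption are placeholders.

    # Minimal sketch, not the test's code.
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_fsdp(module: torch.nn.Module) -> FSDP:
        local_rank = dist.get_rank()  # assumes one process per GPU
        # Option 1: make the current device explicit before FSDP initialization.
        torch.cuda.set_device(local_rank)
        # Option 2: pass an indexed device instead of the bare "cuda" string.
        return FSDP(module, device_id=torch.device("cuda", local_rank))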
2025-12-04T12:52:45.8492397Z ======================= 1 failed, 9 deselected in 8.82s ======================== 2025-12-04T12:52:45.8492436Z Got exit code 1 2025-12-04T12:52:45.8492475Z Retrying single test... 2025-12-04T12:52:45.8492664Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-794d127c8ae76675.xml 2025-12-04T12:52:45.8492721Z ============================= test session starts ============================== 2025-12-04T12:52:45.8492833Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8492874Z cachedir: .pytest_cache 2025-12-04T12:52:45.8493032Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8493077Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8493117Z configfile: pytest.ini 2025-12-04T12:52:45.8493289Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8493362Z collecting ... collected 10 items / 9 deselected / 1 selected 2025-12-04T12:52:45.8493583Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8493629Z Running 1 items in this shard 2025-12-04T12:52:45.8493631Z 2025-12-04T12:52:45.8493943Z distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda I1204 12:52:34.186000 506683 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 506752 2025-12-04T12:52:45.8494098Z I1204 12:52:34.187000 506683 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 506753 2025-12-04T12:52:45.8494593Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8494675Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8495163Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T12:52:45.8495223Z device_from_device_id = _get_device_from_device_id( 2025-12-04T12:52:45.8496300Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). 
If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8496426Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8497480Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: c10d::allreduce_: an autograd kernel was not registered to the Autograd key(s) but we are trying to backprop through it. This may lead to silently incorrect behavior. This behavior is deprecated and will be removed in a future version of PyTorch. If your operator is differentiable, please ensure you have registered an autograd kernel to the correct Autograd key (e.g. DispatchKey::Autograd, DispatchKey::CompositeImplicitAutograd). If your operator is not differentiable, or to squash this warning and use the previous behavior, please register torch::CppFunction::makeFallthrough() to DispatchKey::Autograd. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:76.) 2025-12-04T12:52:45.8497606Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T12:52:45.8497750Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8497920Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8498245Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8498401Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8498699Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8498824Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8499101Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8499268Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8499556Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8499705Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8499979Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8500117Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8500394Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8500543Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8501018Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8501133Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8501329Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8501684Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8501798Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8502009Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8502185Z [rank1]:E1204 12:52:41.051000 506753 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T12:52:45.8502225Z dist init r=1, world=2 2025-12-04T12:52:45.8502362Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T12:52:45.8502522Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T12:52:45.8502806Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8502973Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T12:52:45.8503259Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8503384Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T12:52:45.8503681Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8503827Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8504101Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8504248Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T12:52:45.8504522Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8504658Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T12:52:45.8504935Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8505084Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T12:52:45.8505555Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 2025-12-04T12:52:45.8505672Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8505867Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8506220Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8506343Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T12:52:45.8506554Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8506720Z [rank0]:E1204 12:52:41.053000 506752 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T12:52:45.8506759Z dist init r=0, world=2 2025-12-04T12:52:45.8507107Z [rank0]:[W1204 12:52:41.936098727 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T12:52:45.8507146Z FAILED [8.7109s] [100%] 2025-12-04T12:52:45.8507148Z 2025-12-04T12:52:45.8507203Z =================================== FAILURES =================================== 2025-12-04T12:52:45.8507304Z _____ TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda _____ 2025-12-04T12:52:45.8507350Z Traceback (most recent call last): 2025-12-04T12:52:45.8507510Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T12:52:45.8507572Z self._join_processes(fn) 2025-12-04T12:52:45.8507744Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T12:52:45.8507798Z self._check_return_codes(fn, elapsed_time) 2025-12-04T12:52:45.8507976Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T12:52:45.8508020Z raise RuntimeError(error) 2025-12-04T12:52:45.8508099Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8508182Z Traceback (most recent call last): 2025-12-04T12:52:45.8508343Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8508385Z getattr(self, test_name)() 2025-12-04T12:52:45.8508544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8508580Z fn() 2025-12-04T12:52:45.8508731Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8508770Z method(*args, **kwargs) 2025-12-04T12:52:45.8508921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8508962Z method(*args, **kwargs) 2025-12-04T12:52:45.8509111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8509147Z with policy(): 2025-12-04T12:52:45.8509299Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8509339Z raise RuntimeError(msg) 2025-12-04T12:52:45.8509684Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
2025-12-04T12:52:45.8509687Z 2025-12-04T12:52:45.8509762Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8509990Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8509992Z 2025-12-04T12:52:45.8510080Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8510082Z 2025-12-04T12:52:45.8510156Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8510203Z Traceback (most recent call last): 2025-12-04T12:52:45.8510364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8510407Z getattr(self, test_name)() 2025-12-04T12:52:45.8510566Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8510601Z fn() 2025-12-04T12:52:45.8510751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8510791Z method(*args, **kwargs) 2025-12-04T12:52:45.8510954Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8510994Z method(*args, **kwargs) 2025-12-04T12:52:45.8511143Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8511180Z with policy(): 2025-12-04T12:52:45.8511329Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8511397Z raise RuntimeError(msg) 2025-12-04T12:52:45.8511739Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8511741Z 2025-12-04T12:52:45.8511816Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8512041Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8512044Z 2025-12-04T12:52:45.8512131Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8512133Z 2025-12-04T12:52:45.8512134Z 2025-12-04T12:52:45.8512210Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T12:52:45.8512298Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T12:52:45.8512531Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-794d127c8ae76675.xml - 2025-12-04T12:52:45.8512591Z =========================== short test summary info ============================ 2025-12-04T12:52:45.8512833Z FAILED [8.7109s] distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T12:52:45.8512878Z Traceback (most recent call last): 2025-12-04T12:52:45.8513043Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8513085Z getattr(self, test_name)() 2025-12-04T12:52:45.8513244Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8513280Z fn() 2025-12-04T12:52:45.8513430Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8513472Z method(*args, **kwargs) 2025-12-04T12:52:45.8513621Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8513661Z method(*args, **kwargs) 2025-12-04T12:52:45.8513809Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8513846Z with policy(): 2025-12-04T12:52:45.8514008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8514049Z raise RuntimeError(msg) 2025-12-04T12:52:45.8514393Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 0. CUDA driver allocated memory was 2019557376 and is now 3489660928. 
2025-12-04T12:52:45.8514397Z 2025-12-04T12:52:45.8514471Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8514711Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8514713Z 2025-12-04T12:52:45.8514802Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8514805Z 2025-12-04T12:52:45.8514863Z Process 1 exited with error code 10 and exception: 2025-12-04T12:52:45.8514909Z Traceback (most recent call last): 2025-12-04T12:52:45.8515073Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T12:52:45.8515126Z getattr(self, test_name)() 2025-12-04T12:52:45.8515296Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T12:52:45.8515330Z fn() 2025-12-04T12:52:45.8515480Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8515519Z method(*args, **kwargs) 2025-12-04T12:52:45.8515669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T12:52:45.8515707Z method(*args, **kwargs) 2025-12-04T12:52:45.8515856Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T12:52:45.8515893Z with policy(): 2025-12-04T12:52:45.8516043Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T12:52:45.8516084Z raise RuntimeError(msg) 2025-12-04T12:52:45.8516430Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda! Caching allocator allocated memory was 512 and is now reported as 9216 on device 1. CUDA driver allocated memory was 1864368128 and is now 3334471680. 2025-12-04T12:52:45.8516433Z 2025-12-04T12:52:45.8516505Z To execute this test, run the following from the base repo dir: 2025-12-04T12:52:45.8516731Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_comm.py TestExplicitUnshardCUDA.test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8516733Z 2025-12-04T12:52:45.8516820Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T12:52:45.8516885Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
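[editor's note] Every run above also ends with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. A minimal sketch of the explicit teardown the warning asks for is shown below, assuming the usual torchrun/env:// rendezvous environment; it is not taken from the test file, and run_workload() is a placeholder.

    # Sketch of explicit process-group teardown (hypothetical placeholders).
    import torch.distributed as dist

    def run_workload() -> None:
        ...  # placeholder for the actual test or training body

    def main() -> None:
        dist.init_process_group(backend="nccl")
        try:
            run_workload()
        finally:
            # Explicit teardown avoids the ProcessGroupNCCL warning about
            # destroy_process_group() not being called before program exit.
            dist.destroy_process_group()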
2025-12-04T12:52:45.8516947Z ======================= 1 failed, 9 deselected in 8.72s ======================== 2025-12-04T12:52:45.8516985Z Got exit code 1 2025-12-04T12:52:45.8517163Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda 2025-12-04T12:52:45.8517291Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T12:52:45.8517479Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-16355a924f1fdd43.xml 2025-12-04T12:52:45.8517537Z ============================= test session starts ============================== 2025-12-04T12:52:45.8517648Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T12:52:45.8517704Z cachedir: .pytest_cache 2025-12-04T12:52:45.8517862Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T12:52:45.8517908Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T12:52:45.8517952Z configfile: pytest.ini 2025-12-04T12:52:45.8518112Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T12:52:45.8518226Z collecting ... collected 10 items / 10 deselected / 0 selected 2025-12-04T12:52:45.8518280Z stepcurrent: skipping 10 already run items. 2025-12-04T12:52:45.8518324Z Running 0 items in this shard 2025-12-04T12:52:45.8518345Z 2025-12-04T12:52:45.8518577Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_comm/distributed.fsdp.test_fsdp_comm-16355a924f1fdd43.xml - 2025-12-04T12:52:45.8518637Z ============================ 10 deselected in 0.00s ============================ 2025-12-04T12:52:45.8520533Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_False_sharding_strategy1_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_False_use_no_sync_True_sharding_strategy1_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_False_sharding_strategy1_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy0_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestCommunicationCUDA::test_communication_nested_model_True_use_no_sync_True_sharding_strategy1_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_False_cuda', 'test/distributed/fsdp/test_fsdp_comm.py::TestExplicitUnshardCUDA::test_unshard_async_use_orig_params_True_cuda'] 2025-12-04T12:52:45.8520564Z 2025-12-04T12:52:45.8520748Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_comm 1/1 
(test/test-reports/distributed.fsdp.test_fsdp_comm_1.1_3b36b42e6bf366b5_.log) 2025-12-04T12:52:45.8520750Z 2025-12-04T12:52:45.8520871Z Finished distributed/fsdp/test_fsdp_comm 1/1 ... [2025-12-04 12:52:45.735130][2292264.38431054], took 5.86min 2025-12-04T12:52:45.8521133Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:52:45.8521220Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:52:45.8521316Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T12:52:45.8521364Z Uploading artifacts took 0.00 seconds 2025-12-04T12:52:45.8521418Z distributed/fsdp/test_fsdp_comm 1/1 failed! 2025-12-04T12:52:45.8521517Z Running distributed/test_c10d_pypg 1/1 ... [2025-12-04 12:52:45.738585][2292264.387768717] 2025-12-04T12:52:45.8521566Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:52:45.8521872Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_pypg.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:52:45.738767] 2025-12-04T12:52:53.0138946Z 2025-12-04T12:52:53.0140916Z distributed/test_c10d_pypg 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_pypg_1.1_55969333d4e99855_.log 2025-12-04T12:52:53.0156648Z Running 48 items in this shard: test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_dataclass_output, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_dataclass_output_unused_param, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_dynamic_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_static_graph_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_static_graph_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_twice_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_unused_params_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_unused_params_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_invoke_work_object, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_no_init_sync, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_with_pypg, 
test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_ddp_with_pypg_with_grad_views, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_invalid_powerSGD_state, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_sync_batch_norm_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkSubclass::test_sync_batch_norm_only_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_dataclass_output, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_dataclass_output_unused_param, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_dynamic_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_static_graph_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_static_graph_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_twice_weight_sharing, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_unused_params_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_unused_params_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_invoke_work_object, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_no_init_sync, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_with_pypg, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_ddp_with_pypg_with_grad_views, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_invalid_powerSGD_state, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_sync_batch_norm_empty_input, test/distributed/test_c10d_pypg.py::TestDDPWithWorkWrapper::test_sync_batch_norm_only_empty_input, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_abort_shutdown, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_attr_overrides, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_block_current_stream, test/distributed/test_c10d_pypg.py::TestPyProcessGroup::test_block_current_stream_use_after_free 2025-12-04T12:52:53.0166793Z 2025-12-04T12:52:53.0166942Z Finished distributed/test_c10d_pypg 1/1 ... [2025-12-04 12:52:53.013480][2292271.662659355], took 0.12min 2025-12-04T12:52:53.0167478Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:52:53.0175019Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:52:53.0177900Z Running distributed/test_pg_wrapper 1/1 ... 
[2025-12-04 12:52:53.017679][2292271.666862421] 2025-12-04T12:52:53.0178110Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:52:53.0179993Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_pg_wrapper.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:52:53.017867] 2025-12-04T12:54:30.9402113Z 2025-12-04T12:54:30.9403404Z distributed/test_pg_wrapper 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_pg_wrapper_1.1_148baf11d75dd18e_.log 2025-12-04T12:54:30.9410309Z Running 17 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_coalescing_manager_debug_mode_detail, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_hang, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_detail, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_off, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_debug_level_detail_no_gloo, test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_new_group_no_gloo, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_hang, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode_off, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda_debug_mode, test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_debug_mode 2025-12-04T12:54:30.9416197Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_coalescing_manager_debug_mode_detail 2025-12-04T12:54:30.9416983Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_hang 2025-12-04T12:54:30.9417712Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_detail 2025-12-04T12:54:30.9418519Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collective_shape_mismatch_debug_mode_off 2025-12-04T12:54:30.9419061Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch 2025-12-04T12:54:30.9419717Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_collectives_op_mismatch_debug_mode 2025-12-04T12:54:30.9420244Z Running 1 items in this shard: 
test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_debug_level_detail_no_gloo 2025-12-04T12:54:30.9420742Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupNCCLWrapperTest::test_new_group_no_gloo 2025-12-04T12:54:30.9421214Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_hang 2025-12-04T12:54:30.9421845Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda 2025-12-04T12:54:30.9422401Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_cuda_debug_mode 2025-12-04T12:54:30.9422965Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode 2025-12-04T12:54:30.9423533Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collective_shape_mismatch_debug_mode_off 2025-12-04T12:54:30.9424078Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch 2025-12-04T12:54:30.9424593Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda 2025-12-04T12:54:30.9425143Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_cuda_debug_mode 2025-12-04T12:54:30.9425692Z Running 1 items in this shard: test/distributed/test_pg_wrapper.py::ProcessGroupGlooWrapperTest::test_collectives_op_mismatch_debug_mode 2025-12-04T12:54:30.9425989Z 2025-12-04T12:54:30.9426167Z Finished distributed/test_pg_wrapper 1/1 ... [2025-12-04 12:54:30.940123][2292369.589302929], took 1.63min 2025-12-04T12:54:30.9426770Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:54:30.9432808Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:54:30.9436242Z Running distributed/tensor/test_utils 1/1 ... [2025-12-04 12:54:30.943527][2292369.592711054] 2025-12-04T12:54:30.9436536Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:54:30.9437992Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/tensor/test_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:54:30.943713] 2025-12-04T12:55:34.2038837Z 2025-12-04T12:55:34.2042500Z distributed/tensor/test_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.tensor.test_utils_1.1_aedfc19f01c5775f_.log 2025-12-04T12:55:34.2048218Z Running 24 items in this shard: test/distributed/tensor/test_utils.py::LocalTest::test_compute_local_shape_and_global_offset_uneven, test/distributed/tensor/test_utils.py::UtilTest::test_compute_global_tensor_shape_1D, test/distributed/tensor/test_utils.py::UtilTest::test_compute_global_tensor_shape_1D_invalid_shape, test/distributed/tensor/test_utils.py::UtilTest::test_compute_global_tensor_shape_failure_2D, test/distributed/tensor/test_utils.py::UtilTest::test_compute_local_shape_and_global_offset_1D, test/distributed/tensor/test_utils.py::UtilTest::test_compute_local_shape_and_global_offset_2D, test/distributed/tensor/test_utils.py::UtilTest::test_compute_local_shape_and_global_offset_3D, test/distributed/tensor/test_utils.py::UtilTest::test_compute_local_shape_and_global_offset_4D, test/distributed/tensor/test_utils.py::UtilTest::test_fsdp_tp_meta_compute, test/distributed/tensor/test_utils.py::UtilTest::test_hsdp_tp_meta_compute, test/distributed/tensor/test_utils.py::UtilTest::test_uneven_fsdp_tp_meta_compute, test/distributed/tensor/test_utils.py::UtilSingleDeviceTest::test_compute_global_tensor_info_non_shard_placements, test/distributed/tensor/test_utils.py::UtilSingleDeviceTest::test_compute_global_tensor_info_shard_placement, test/distributed/tensor/test_utils.py::UtilSingleDeviceTest::test_compute_global_tensor_info_unsupported_placement, test/distributed/tensor/test_utils.py::UtilSingleDeviceTest::test_compute_tensor_info, test/distributed/tensor/test_utils.py::TestStridedSharding::test_1d_mesh_strided_sharding, test/distributed/tensor/test_utils.py::TestStridedSharding::test_2d_mesh_2d_tensor_strided_sharding, test/distributed/tensor/test_utils.py::TestStridedSharding::test_2d_mesh_strided_sharding, test/distributed/tensor/test_utils.py::TestStridedSharding::test_2d_mesh_uneven_strided_shard, test/distributed/tensor/test_utils.py::Test_StridedShard_with_shard_order::test_StridedShard_not_convertible_to_shard_order, test/distributed/tensor/test_utils.py::Test_StridedShard_with_shard_order::test_StridedShard_to_shard_order, test/distributed/tensor/test_utils.py::Test2DStridedLocalShard::test_fsdp1_tp_2d_dtensor_local_shards_and_offsets, test/distributed/tensor/test_utils.py::Test2DStridedLocalShard::test_fsdp2_tp_2d_dtensor_local_shards_and_offsets, test/distributed/tensor/test_utils.py::TestExplicitRedistribute::test_explicit_matmul 2025-12-04T12:55:34.2053162Z 2025-12-04T12:55:34.2053319Z Finished distributed/tensor/test_utils 1/1 ... [2025-12-04 12:55:34.203631][2292432.852811298], took 1.05min 2025-12-04T12:55:34.2054975Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:55:34.2071131Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:55:34.2074200Z Running distributed/fsdp/test_fsdp_unshard_params 1/1 ... 
[2025-12-04 12:55:34.207349][2292432.856532648] 2025-12-04T12:55:34.2074431Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:55:34.2076413Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_unshard_params.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:55:34.207530] 2025-12-04T12:56:39.8231544Z 2025-12-04T12:56:39.8233011Z distributed/fsdp/test_fsdp_unshard_params 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_unshard_params_1.1_339d2f7e4cf208e0_.log 2025-12-04T12:56:39.8240801Z Running 15 items in this shard: test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_named_parameters_and_buffers, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_param_data, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_recurse, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_respects_reshard, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_params_writeback, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_singleton_param_writeback, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_unshard_submodule, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_with_grads_core, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParams::test_with_grads_none_grads, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsNoShard::test_unshard_params_param_data_no_shard, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsNoShard::test_unshard_params_writeback_no_shard, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_offload_to_cpu_no_shard_raises, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_rank0_only_with_writeback_raises, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_unshard_params_from_backward_raises, test/distributed/fsdp/test_fsdp_unshard_params.py::TestUnshardParamsErrors::test_unshard_params_from_forward_raises 2025-12-04T12:56:39.8246373Z 2025-12-04T12:56:39.8252576Z Finished distributed/fsdp/test_fsdp_unshard_params 1/1 ... [2025-12-04 12:56:39.822765][2292498.471944791], took 1.09min 2025-12-04T12:56:39.8253893Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:56:39.8268578Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:56:39.8271146Z Running distributed/checkpoint/test_state_dict_utils 1/1 ... [2025-12-04 12:56:39.826980][2292498.476163733] 2025-12-04T12:56:39.8271516Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:56:39.8273070Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/checkpoint/test_state_dict_utils.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 12:56:39.827156] 2025-12-04T12:57:14.9447883Z 2025-12-04T12:57:14.9449283Z distributed/checkpoint/test_state_dict_utils 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.checkpoint.test_state_dict_utils_1.1_b968ab5788bde42f_.log 2025-12-04T12:57:14.9452128Z Running 7 items in this shard: test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_complicated_dict, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_cpu_and_ranks_only, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_cpu_offload_for_dtensor, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_create_cpu_state_dict, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_gather_state_dict_dtensor, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_gather_with_cpu_and_ranks_only, test/distributed/checkpoint/test_state_dict_utils.py::TestStateDictUtils::test_state_dict_util_distribute_tensors 2025-12-04T12:57:14.9454309Z 2025-12-04T12:57:14.9454596Z Finished distributed/checkpoint/test_state_dict_utils 1/1 ... [2025-12-04 12:57:14.944476][2292533.593655859], took 0.59min 2025-12-04T12:57:14.9467953Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:57:14.9486448Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:57:14.9487624Z Running distributed/_shard/sharded_tensor/ops/test_init 1/1 ... [2025-12-04 12:57:14.948606][2292533.597789934] 2025-12-04T12:57:14.9487942Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:57:14.9490919Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_init.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:57:14.948799] 2025-12-04T12:57:31.9881897Z 2025-12-04T12:57:31.9883172Z distributed/_shard/sharded_tensor/ops/test_init 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_init_1.1_103df0e7967870d8_.log 2025-12-04T12:57:31.9884689Z Running 3 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_init.py::TestShardedTensorNNInit::test_init_sharded_tensor_with_kaiming_uniform, test/distributed/_shard/sharded_tensor/ops/test_init.py::TestShardedTensorNNInit::test_init_sharded_tensor_with_normal, test/distributed/_shard/sharded_tensor/ops/test_init.py::TestShardedTensorNNInit::test_init_sharded_tensor_with_uniform 2025-12-04T12:57:31.9885681Z 2025-12-04T12:57:31.9885928Z Finished distributed/_shard/sharded_tensor/ops/test_init 1/1 ... [2025-12-04 12:57:31.987876][2292550.637055577], took 0.28min 2025-12-04T12:57:31.9902619Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:57:31.9917672Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:57:31.9920337Z Running distributed/_shard/sharded_tensor/ops/test_embedding 1/1 ... 
[2025-12-04 12:57:31.991868][2292550.641051445] 2025-12-04T12:57:31.9920802Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:57:31.9922839Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_embedding.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:57:31.992054] 2025-12-04T12:57:44.7764390Z 2025-12-04T12:57:44.7765757Z distributed/_shard/sharded_tensor/ops/test_embedding 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_embedding_1.1_0331a6abc537409d_.log 2025-12-04T12:57:44.7767436Z Running 2 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_embedding.py::TestShardedEmbedding::test_sharded_embedding_colwise, test/distributed/_shard/sharded_tensor/ops/test_embedding.py::TestShardedEmbedding::test_sharded_embedding_rowwise 2025-12-04T12:57:44.7768649Z 2025-12-04T12:57:44.7769024Z Finished distributed/_shard/sharded_tensor/ops/test_embedding 1/1 ... [2025-12-04 12:57:44.776143][2292563.425322493], took 0.21min 2025-12-04T12:57:44.7786086Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:57:44.7801675Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:57:44.7804843Z Running distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 ... [2025-12-04 12:57:44.780319][2292563.429502837] 2025-12-04T12:57:44.7805257Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:57:44.7806326Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharded_tensor/ops/test_embedding_bag.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:57:44.780501] 2025-12-04T12:57:57.4637269Z 2025-12-04T12:57:57.4638735Z distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.ops.test_embedding_bag_1.1_878df039e5b1d3c0_.log 2025-12-04T12:57:57.4640165Z Running 2 items in this shard: test/distributed/_shard/sharded_tensor/ops/test_embedding_bag.py::TestShardedEmbeddingBag::test_sharded_embedding_bag_colwise, test/distributed/_shard/sharded_tensor/ops/test_embedding_bag.py::TestShardedEmbeddingBag::test_sharded_embedding_bag_rowwise 2025-12-04T12:57:57.4641092Z 2025-12-04T12:57:57.4642029Z Finished distributed/_shard/sharded_tensor/ops/test_embedding_bag 1/1 ... [2025-12-04 12:57:57.463458][2292576.112637949], took 0.21min 2025-12-04T12:57:57.4658215Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:57:57.4674803Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:57:57.4678752Z Running distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 ... 
[2025-12-04 12:57:57.467635][2292576.116818674] 2025-12-04T12:57:57.4679397Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:57:57.4680488Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:57:57.467823] 2025-12-04T12:58:09.5999100Z 2025-12-04T12:58:09.6001354Z distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed._shard.sharded_tensor.test_sharded_tensor_reshard_1.1_2ef61b254586826d_.log 2025-12-04T12:58:09.6003648Z Running 2 items in this shard: test/distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py::TestReshard::test_sharded_tensor_reshard, test/distributed/_shard/sharded_tensor/test_sharded_tensor_reshard.py::TestReshard::test_sharded_tensor_reshard_errors 2025-12-04T12:58:09.6004699Z 2025-12-04T12:58:09.6005168Z Finished distributed/_shard/sharded_tensor/test_sharded_tensor_reshard 1/1 ... [2025-12-04 12:58:09.599546][2292588.248725117], took 0.20min 2025-12-04T12:58:09.6020728Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T12:58:09.6037665Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T12:58:09.6041164Z Running distributed/fsdp/test_fsdp_core 1/3 ... [2025-12-04 12:58:09.603951][2292588.253134818] 2025-12-04T12:58:09.6041533Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T12:58:09.6042751Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_core.py', '--shard-id=1', '--num-shards=3', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 12:58:09.604140] 2025-12-04T13:21:31.3005767Z 2025-12-04T13:21:31.3015002Z PRINTING LOG FILE of distributed/fsdp/test_fsdp_core 1/3 (test/test-reports/distributed.fsdp.test_fsdp_core_1.3_b5bdac945a318f3b_.log) 2025-12-04T13:21:31.3015708Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4f6a1b3360576c80.xml 2025-12-04T13:21:31.3016122Z ============================= test session starts ============================== 2025-12-04T13:21:31.3016413Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3016669Z cachedir: .pytest_cache 2025-12-04T13:21:31.3016960Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3017276Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3017441Z configfile: pytest.ini 2025-12-04T13:21:31.3017732Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3018041Z collecting ... 
collected 60 items 2025-12-04T13:21:31.3018275Z stepcurrent: Cannot find last run test, not skipping 2025-12-04T13:21:31.3022917Z Running 19 items in this shard: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda, test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda, test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda, test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.3026571Z 2025-12-04T13:21:31.3026897Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda I1204 12:58:11.346000 529288 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 529357 2025-12-04T13:21:31.3027410Z I1204 12:58:11.346000 529288 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 529358 2025-12-04T13:21:31.3027761Z I1204 12:58:11.347000 529288 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 529359 2025-12-04T13:21:31.3028109Z I1204 12:58:11.348000 529288 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 529360 2025-12-04T13:21:31.3028724Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better 
inference performance) 2025-12-04T13:21:31.3029180Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3029779Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3030396Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3030898Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3031347Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3031949Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3032549Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3033010Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3033456Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3033915Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3034385Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3034954Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3035541Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3036127Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3036709Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3038097Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3039585Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3041071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3042509Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3043955Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3045429Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3046862Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. 
This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3048325Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3048630Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3048973Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3049492Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3049993Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3050519Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3050982Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3051466Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3051948Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3052425Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3052887Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3053352Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3053866Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3054399Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 
2025-12-04T13:21:31.3054912Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3055646Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 2025-12-04T13:21:31.3056283Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3056692Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3071705Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3072401Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3072799Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3073253Z [rank2]:E1204 12:58:18.644000 529359 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3073509Z dist init r=2, world=4 2025-12-04T13:21:31.3073736Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3074108Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3074639Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3075133Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3075670Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3076133Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3076592Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3077093Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3077567Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T13:21:31.3078038Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3078631Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3079192Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3079663Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3080138Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3080818Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 2025-12-04T13:21:31.3081469Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3081838Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3082523Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3083038Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3083411Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3083886Z [rank0]:E1204 12:58:18.645000 529357 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3084140Z dist init r=0, world=4 2025-12-04T13:21:31.3084354Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3084703Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3085223Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3085755Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3086315Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", 
line 772, in wrapper 2025-12-04T13:21:31.3086810Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3087343Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3087883Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3089845Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3090360Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3090875Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3091408Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3091867Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3092338Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3093010Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 
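The figures in each rank's leak report read directly: the first pair is caching-allocator bytes held on that device before and after the test, the second pair is driver-allocated bytes. A back-of-the-envelope check of the rank 3 report above (illustrative arithmetic only; this helper is not part of the test harness):

    # Illustrative arithmetic over the figures printed in the rank 3 leak report above.
    # The constants are copied from the log; nothing here belongs to the PyTorch test harness.
    alloc_before, alloc_after = 512, 19_456                       # caching allocator bytes, device 3
    driver_before, driver_after = 2_250_244_096, 3_552_575_488    # driver-allocated bytes, device 3

    alloc_leak = alloc_after - alloc_before        # 18_944 bytes still tracked by the allocator
    driver_growth = driver_after - driver_before   # ~1.21 GiB newly claimed from the driver

    print(f"caching allocator delta: {alloc_leak} bytes")
    print(f"driver allocation delta: {driver_growth / 2**30:.2f} GiB")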
2025-12-04T13:21:31.3093637Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3094040Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3094692Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3095212Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3095676Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3096107Z [rank3]:E1204 12:58:18.649000 529360 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3096357Z dist init r=3, world=4 2025-12-04T13:21:31.3096569Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3096956Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3097526Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3098013Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3098550Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3099018Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3099467Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3099983Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3100459Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3100928Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3101391Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3101843Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3102294Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3102760Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3103420Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 2025-12-04T13:21:31.3104041Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3104390Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3104983Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3105485Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3105848Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3106284Z [rank1]:E1204 12:58:18.712000 529358 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3106523Z dist init r=1, world=4 2025-12-04T13:21:31.3106939Z [rank0]:[W1204 12:58:18.472218625 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3107350Z FAILED [9.2138s] [ 5%] 2025-12-04T13:21:31.3107416Z 2025-12-04T13:21:31.3107479Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3107672Z ___ TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda ____ 2025-12-04T13:21:31.3107852Z Traceback (most recent call last): 2025-12-04T13:21:31.3108100Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3108388Z self._join_processes(fn) 2025-12-04T13:21:31.3108634Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3108919Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3109203Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3109477Z raise RuntimeError(error) 2025-12-04T13:21:31.3109630Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3109792Z Traceback (most recent call last): 2025-12-04T13:21:31.3110032Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3110273Z getattr(self, test_name)() 2025-12-04T13:21:31.3110509Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3110745Z fn() 2025-12-04T13:21:31.3110956Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3111201Z method(*args, **kwargs) 2025-12-04T13:21:31.3111433Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3111667Z method(*args, **kwargs) 2025-12-04T13:21:31.3111884Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3112112Z with policy(): 2025-12-04T13:21:31.3112326Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3112562Z raise RuntimeError(msg) 2025-12-04T13:21:31.3112986Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 
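Per the traceback, the mem_leak_check is applied as a context manager in torch/testing/_internal/common_utils.py whose __exit__ compares per-device memory statistics captured around the test body and raises when they grew. A rough standalone approximation of that idea using public torch.cuda APIs (assumed structure for illustration, not the harness's actual implementation):

    # Rough approximation of a per-device CUDA/ROCm leak check around a test body.
    # Assumed structure only; the real check lives in torch/testing/_internal/common_utils.py.
    import contextlib
    import torch

    @contextlib.contextmanager
    def approx_cuda_leak_check(device: int = 0):
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)   # caching-allocator bytes in use
        free_before, _ = torch.cuda.mem_get_info(device)     # driver-level free bytes
        yield
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after} bytes, driver free "
                f"{free_before} -> {free_after} bytes"
            )

    # Usage sketch: a tensor kept alive past the block trips the check.
    # with approx_cuda_leak_check(0):
    #     run_test()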
2025-12-04T13:21:31.3113371Z 2025-12-04T13:21:31.3113445Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3113790Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3114061Z 2025-12-04T13:21:31.3114151Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3114276Z 2025-12-04T13:21:31.3114278Z 2025-12-04T13:21:31.3114357Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3114566Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3114950Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4f6a1b3360576c80.xml - 2025-12-04T13:21:31.3115292Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3115647Z FAILED [9.2138s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3115982Z Traceback (most recent call last): 2025-12-04T13:21:31.3116238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3116495Z getattr(self, test_name)() 2025-12-04T13:21:31.3116737Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3116979Z fn() 2025-12-04T13:21:31.3117190Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3117434Z method(*args, **kwargs) 2025-12-04T13:21:31.3117664Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3117933Z method(*args, **kwargs) 2025-12-04T13:21:31.3118193Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3118443Z with policy(): 2025-12-04T13:21:31.3118653Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3118886Z raise RuntimeError(msg) 2025-12-04T13:21:31.3119314Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 2025-12-04T13:21:31.3119701Z 2025-12-04T13:21:31.3119776Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3120120Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3120390Z 2025-12-04T13:21:31.3120480Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3120666Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3120827Z ============================== 1 failed in 9.35s =============================== 2025-12-04T13:21:31.3120964Z Got exit code 1 2025-12-04T13:21:31.3121071Z Retrying single test... 2025-12-04T13:21:31.3121333Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0680ba892e5781e1.xml 2025-12-04T13:21:31.3121621Z ============================= test session starts ============================== 2025-12-04T13:21:31.3121837Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3122030Z cachedir: .pytest_cache 2025-12-04T13:21:31.3122262Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3122507Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3122630Z configfile: pytest.ini 2025-12-04T13:21:31.3122862Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3123142Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3123478Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3123780Z Running 1 items in this shard 2025-12-04T13:21:31.3123851Z 2025-12-04T13:21:31.3124180Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda I1204 12:58:23.038000 529690 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 529759 2025-12-04T13:21:31.3124682Z I1204 12:58:23.039000 529690 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 529760 2025-12-04T13:21:31.3125028Z I1204 12:58:23.039000 529690 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 529761 2025-12-04T13:21:31.3125374Z I1204 12:58:23.040000 529690 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 529762 2025-12-04T13:21:31.3125932Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3126377Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3126833Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3127304Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3127887Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
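The FSDP UserWarning above fires because `device_id` was passed as the bare string "cuda" with no index, so each rank falls back to its current device. A minimal sketch of the two remedies the warning itself suggests (wrap_with_fsdp is a made-up helper name, and the process group for the rank is assumed to be initialized already):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_fsdp(model, rank):
        # Make the rank's device explicit so FSDP does not have to infer it...
        torch.cuda.set_device(rank)
        # ...and/or pass an indexed device instead of the bare "cuda" string.
        return FSDP(model, device_id=torch.device("cuda", rank))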
2025-12-04T13:21:31.3128525Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3129112Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3129695Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3130155Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3130601Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3131182Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3131769Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3132222Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3132661Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3133255Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3133845Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3135233Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
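The AccumulateGrad stream-mismatch warning above names its own escape hatch. If the mismatch has been confirmed to be intentional, it can be silenced with the exact call quoted in the warning text; a one-line sketch:

    import torch

    # Suppress the AccumulateGrad stream-mismatch warning quoted above; only
    # appropriate once the stream mismatch is understood to be intentional.
    torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)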
2025-12-04T13:21:31.3136679Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3138130Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3139599Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3141042Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3142462Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3143905Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.3145324Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3145635Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3145989Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3146503Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3147011Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3147511Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3147969Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3148468Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3148949Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3149426Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3149899Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3150363Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3150814Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3151276Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3151751Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3152424Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 
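Each rank that trips the leak check exits with code 10, and the parent test process converts any non-zero child exit code into the "Process N exited with error code 10" RuntimeError shown in the failure blocks. A loose sketch of that spawn, join, and check pattern (illustrative only; run_ranks is a made-up helper, and the real logic lives in common_distributed.py's _join_processes and _check_return_codes):

    import torch.multiprocessing as mp

    def run_ranks(target, world_size=4):
        # Spawn one process per rank, wait for all of them, then surface failures.
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=target, args=(rank, world_size)) for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")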
2025-12-04T13:21:31.3153055Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3153429Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3154029Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3154551Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3154923Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3155349Z [rank0]:E1204 12:58:30.310000 529759 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3155600Z dist init r=0, world=4 2025-12-04T13:21:31.3155815Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3156178Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3156701Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3157188Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3157677Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3158134Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3158624Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3159097Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3159568Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3160039Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3160508Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3160961Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3161417Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3161894Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3162580Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 2025-12-04T13:21:31.3163211Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3163570Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3164168Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3164681Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3165055Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3165497Z [rank1]:E1204 12:58:30.311000 529760 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3165762Z dist init r=1, world=4 2025-12-04T13:21:31.3165999Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3166344Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3166839Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3167328Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3167813Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3168314Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3168768Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3169241Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.3169714Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3170186Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3170655Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3171118Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3171576Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3172039Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3172726Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 2025-12-04T13:21:31.3173345Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3173695Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3174283Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3174788Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3175193Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3175619Z [rank3]:E1204 12:58:30.319000 529762 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3175858Z dist init r=3, world=4 2025-12-04T13:21:31.3176064Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3176402Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3176887Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3177367Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3177845Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3178332Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3178770Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3179232Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3179695Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3180163Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3180624Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3181075Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3181543Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3182011Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3182673Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 
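The ProcessGroupNCCL warning logged after each failing run ("destroy_process_group() was not called before program exit") points at cleanup the multiprocess harness skips when ranks abort early. In an ordinary distributed script the shutdown it asks for looks roughly like the sketch below (main is a made-up entry point; MASTER_ADDR and MASTER_PORT are assumed to be set in the environment; see the linked distributed.html#shutdown documentation for the authoritative guidance):

    import torch
    import torch.distributed as dist

    def main(rank, world_size):
        # Assumes MASTER_ADDR / MASTER_PORT are provided via the environment.
        dist.init_process_group("nccl", rank=rank, world_size=world_size)
        torch.cuda.set_device(rank)
        try:
            ...  # training / test body for this rank
        finally:
            # Explicitly tear the process group down so NCCL resources are released.
            dist.destroy_process_group()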
2025-12-04T13:21:31.3183291Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3183642Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3184242Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3184777Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3185140Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3185553Z [rank2]:E1204 12:58:30.319000 529761 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3185792Z dist init r=2, world=4 2025-12-04T13:21:31.3186195Z [rank0]:[W1204 12:58:30.157725631 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3186606Z FAILED [9.1131s] [100%] 2025-12-04T13:21:31.3186673Z 2025-12-04T13:21:31.3186731Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3186922Z ___ TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda ____ 2025-12-04T13:21:31.3187101Z Traceback (most recent call last): 2025-12-04T13:21:31.3187346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3187590Z self._join_processes(fn) 2025-12-04T13:21:31.3187835Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3188101Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3188421Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3188681Z raise RuntimeError(error) 2025-12-04T13:21:31.3188834Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3188996Z Traceback (most recent call last): 2025-12-04T13:21:31.3189237Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3189479Z getattr(self, test_name)() 2025-12-04T13:21:31.3189710Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3189941Z fn() 2025-12-04T13:21:31.3190143Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3190375Z method(*args, **kwargs) 2025-12-04T13:21:31.3190613Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3190845Z method(*args, **kwargs) 2025-12-04T13:21:31.3191064Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3191292Z with policy(): 2025-12-04T13:21:31.3191503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3191736Z raise RuntimeError(msg) 2025-12-04T13:21:31.3192155Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 2025-12-04T13:21:31.3192539Z 2025-12-04T13:21:31.3192615Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3192969Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3193252Z 2025-12-04T13:21:31.3193357Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3193481Z 2025-12-04T13:21:31.3193483Z 2025-12-04T13:21:31.3193563Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3193765Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3194120Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0680ba892e5781e1.xml - 2025-12-04T13:21:31.3194447Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3194791Z FAILED [9.1131s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3195117Z Traceback (most recent call last): 2025-12-04T13:21:31.3195363Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3195606Z getattr(self, test_name)() 2025-12-04T13:21:31.3195837Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3196067Z fn() 2025-12-04T13:21:31.3196270Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3196499Z method(*args, **kwargs) 2025-12-04T13:21:31.3196716Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3196946Z method(*args, **kwargs) 2025-12-04T13:21:31.3197163Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3197391Z with policy(): 2025-12-04T13:21:31.3197602Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3197836Z raise RuntimeError(msg) 2025-12-04T13:21:31.3198299Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 
2025-12-04T13:21:31.3198679Z 2025-12-04T13:21:31.3198755Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3199110Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3199376Z 2025-12-04T13:21:31.3199464Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3199651Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3199817Z ======================= 1 failed, 18 deselected in 9.25s ======================= 2025-12-04T13:21:31.3199956Z Got exit code 1 2025-12-04T13:21:31.3200056Z Retrying single test... 2025-12-04T13:21:31.3200309Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9ec3b60899159e6a.xml 2025-12-04T13:21:31.3200596Z ============================= test session starts ============================== 2025-12-04T13:21:31.3200808Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3200999Z cachedir: .pytest_cache 2025-12-04T13:21:31.3201225Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3201462Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3201625Z configfile: pytest.ini 2025-12-04T13:21:31.3201851Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3202145Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3202474Z stepcurrent: skipping 0 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3202769Z Running 1 items in this shard 2025-12-04T13:21:31.3202846Z 2025-12-04T13:21:31.3203152Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda I1204 12:58:34.484000 530092 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 530161 2025-12-04T13:21:31.3203640Z I1204 12:58:34.485000 530092 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 530162 2025-12-04T13:21:31.3203989Z I1204 12:58:34.485000 530092 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 530163 2025-12-04T13:21:31.3204335Z I1204 12:58:34.486000 530092 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 530164 2025-12-04T13:21:31.3204883Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3205323Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3205906Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3206491Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3206944Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3207380Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3207968Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3208588Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3209037Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3209482Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3210047Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3210628Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3211090Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3211550Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3212123Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3212702Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3214069Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3215488Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3216932Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3218398Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3219829Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3221252Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3222687Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. 
To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3224092Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3224395Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3224739Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3225235Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3225716Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3226195Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3226644Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3227085Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3227551Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3228027Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3228532Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3228993Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3229443Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3229896Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3230387Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3231062Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! 
Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 2025-12-04T13:21:31.3231700Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3232053Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3232646Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3233152Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3233517Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3233930Z [rank3]:E1204 12:58:42.013000 530164 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3234169Z dist init r=3, world=4 2025-12-04T13:21:31.3234372Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3234709Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3235197Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3235676Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3236152Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3236596Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3237051Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3237515Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3237979Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3238480Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3238943Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 
2025-12-04T13:21:31.3239392Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3239863Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3240354Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3241015Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3602907136. 2025-12-04T13:21:31.3241635Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3241983Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3242570Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3243082Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3243446Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3243862Z [rank2]:E1204 12:58:42.015000 530163 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3244105Z dist init r=2, world=4 2025-12-04T13:21:31.3244308Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3244647Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3245132Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3245609Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3246108Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3246556Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3246999Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T13:21:31.3247461Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3247922Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3248430Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3248909Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3249388Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3249843Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3250305Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3250966Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3755999232. 
2025-12-04T13:21:31.3251587Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3251939Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3252526Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3253027Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3253390Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3253801Z [rank0]:E1204 12:58:42.070000 530161 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3254042Z dist init r=0, world=4 2025-12-04T13:21:31.3254243Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3254577Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3255059Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3255549Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3256026Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3256475Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3256916Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3257377Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3257845Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3258372Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3258848Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3259296Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3259747Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3260213Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3260872Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3619684352. 2025-12-04T13:21:31.3261492Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3261839Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3262424Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3262927Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3263290Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3263704Z [rank1]:E1204 12:58:42.106000 530162 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3263944Z dist init r=1, world=4 2025-12-04T13:21:31.3264343Z [rank0]:[W1204 12:58:42.015866417 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3264764Z FAILED [9.4145s] [100%] 2025-12-04T13:21:31.3264828Z 2025-12-04T13:21:31.3264888Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3265082Z ___ TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda ____ 2025-12-04T13:21:31.3265261Z Traceback (most recent call last): 2025-12-04T13:21:31.3265505Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3265749Z self._join_processes(fn) 2025-12-04T13:21:31.3265993Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3266258Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3266526Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3266785Z raise RuntimeError(error) 2025-12-04T13:21:31.3266942Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3267104Z Traceback (most recent call last): 2025-12-04T13:21:31.3267376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3267635Z getattr(self, test_name)() 2025-12-04T13:21:31.3267865Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3268096Z fn() 2025-12-04T13:21:31.3268332Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3268563Z method(*args, **kwargs) 2025-12-04T13:21:31.3268783Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3269014Z method(*args, **kwargs) 2025-12-04T13:21:31.3269297Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3269527Z with policy(): 2025-12-04T13:21:31.3269740Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3269974Z raise RuntimeError(msg) 2025-12-04T13:21:31.3270391Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 
2025-12-04T13:21:31.3270772Z 2025-12-04T13:21:31.3270849Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3271188Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3271452Z 2025-12-04T13:21:31.3271543Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3271670Z 2025-12-04T13:21:31.3271672Z 2025-12-04T13:21:31.3271752Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3271955Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3272311Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9ec3b60899159e6a.xml - 2025-12-04T13:21:31.3272641Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3272987Z FAILED [9.4145s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3273335Z Traceback (most recent call last): 2025-12-04T13:21:31.3273581Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3273827Z getattr(self, test_name)() 2025-12-04T13:21:31.3274058Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3274290Z fn() 2025-12-04T13:21:31.3274491Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3274720Z method(*args, **kwargs) 2025-12-04T13:21:31.3274940Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3275167Z method(*args, **kwargs) 2025-12-04T13:21:31.3275384Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3275613Z with policy(): 2025-12-04T13:21:31.3275824Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3276090Z raise RuntimeError(msg) 2025-12-04T13:21:31.3276509Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3552575488. 2025-12-04T13:21:31.3276916Z 2025-12-04T13:21:31.3276992Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3277333Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3277597Z 2025-12-04T13:21:31.3277686Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3277874Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
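The RuntimeError above is raised by the leak-check policy that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables (the `with policy():` context from common_utils.py visible in the traceback): it records caching-allocator and CUDA-driver allocations per device before the test body and compares them afterwards. A rough analogue of that before/after comparison, sketched here only for illustration and not the actual CUDAMemoryLeakCheck implementation, looks like:

import torch

def assert_no_caching_allocator_growth(fn, device: int = 0) -> None:
    # Illustrative only: the real check in torch/testing/_internal/common_utils.py
    # also queries driver-level allocations, which this sketch does not.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    before = torch.cuda.memory_allocated(device)  # bytes held by the caching allocator
    fn()
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator went from {before} to {after} bytes"
        )

In the log this comparison fails on every rank (the allocator goes from 512 to 19456 bytes), so the check raises and each worker exits with code 10.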
2025-12-04T13:21:31.3278042Z ======================= 1 failed, 18 deselected in 9.55s ======================= 2025-12-04T13:21:31.3278248Z Got exit code 1 2025-12-04T13:21:31.3278486Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda 2025-12-04T13:21:31.3278821Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3279174Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-dbdb1831962e97ea.xml 2025-12-04T13:21:31.3279457Z ============================= test session starts ============================== 2025-12-04T13:21:31.3279671Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3279899Z cachedir: .pytest_cache 2025-12-04T13:21:31.3280125Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3280366Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3280485Z configfile: pytest.ini 2025-12-04T13:21:31.3280712Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3280985Z collecting ... collected 60 items / 1 deselected / 59 selected 2025-12-04T13:21:31.3281148Z stepcurrent: skipping 1 already run items. 2025-12-04T13:21:31.3281279Z Running 18 items in this shard 2025-12-04T13:21:31.3281353Z 2025-12-04T13:21:31.3281661Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda I1204 12:58:46.474000 530494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 530563 2025-12-04T13:21:31.3282173Z I1204 12:58:46.475000 530494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 530564 2025-12-04T13:21:31.3282523Z I1204 12:58:46.476000 530494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 530565 2025-12-04T13:21:31.3282863Z I1204 12:58:46.476000 530494 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 530566 2025-12-04T13:21:31.3283412Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3283851Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3284286Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3284720Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3285309Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3285926Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3286509Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3287091Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3287547Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3287989Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3288594Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3289175Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3289625Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3290063Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3290631Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3291211Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3291450Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3291811Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3292305Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3292787Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3293264Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3293716Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3294172Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3294652Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3295136Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3295598Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3296064Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3296519Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3296977Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3297445Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3298111Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
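The UserWarnings from torch/distributed/fsdp/_init_utils.py above note that FSDP got the argument `device_id` cuda without an explicit index and suggest either calling torch.cuda.set_device() before FSDP initialization or passing the explicit device index. A minimal sketch of both remedies, assuming a per-process `rank` supplied by the launcher (this is not the test's own wrapping code):

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_with_fsdp(model: torch.nn.Module, rank: int) -> FSDP:
    # Remedy 1 from the warning: make the current device explicit before FSDP init.
    torch.cuda.set_device(rank)
    # Remedy 2: pass a device with an explicit index instead of the bare "cuda".
    return FSDP(model, device_id=torch.device("cuda", rank))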
2025-12-04T13:21:31.3298771Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3299124Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3299717Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3300224Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3300591Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3301023Z [rank1]:E1204 12:58:53.957000 530564 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3301267Z dist init r=1, world=4 2025-12-04T13:21:31.3301471Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3301811Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3302304Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3302784Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3303265Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3303725Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3304198Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3304665Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3305129Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3305596Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3330563Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3331080Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3331554Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3332033Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3332714Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:21:31.3333358Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3333723Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3334324Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3334891Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3335270Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3335696Z [rank3]:E1204 12:58:53.957000 530566 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3335946Z dist init r=3, world=4 2025-12-04T13:21:31.3336161Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3336507Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3337005Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3337491Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3338005Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3338525Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3338972Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3339443Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.3339917Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3340388Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3340856Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3341314Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3341777Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3342250Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3342922Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T13:21:31.3343555Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3343911Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3344520Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3345031Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3345404Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3345826Z [rank2]:E1204 12:58:53.968000 530565 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3346070Z dist init r=2, world=4 2025-12-04T13:21:31.3346277Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3346619Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3347128Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3347647Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3348130Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3348628Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3349077Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3349554Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3350027Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3350494Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3350958Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3351413Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3351875Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3352352Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3353018Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T13:21:31.3353662Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3354016Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3354606Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3355115Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3355485Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3355908Z [rank0]:E1204 12:58:53.969000 530563 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3356154Z dist init r=0, world=4 2025-12-04T13:21:31.3356581Z [rank0]:[W1204 12:58:54.844170899 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3357026Z FAILED [9.4156s] [ 5%] 2025-12-04T13:21:31.3357092Z 2025-12-04T13:21:31.3357158Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3357354Z ____ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda ____ 2025-12-04T13:21:31.3357537Z Traceback (most recent call last): 2025-12-04T13:21:31.3357791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3358042Z self._join_processes(fn) 2025-12-04T13:21:31.3358353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3358626Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3358906Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3359170Z raise RuntimeError(error) 2025-12-04T13:21:31.3359328Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3359496Z Traceback (most recent call last): 2025-12-04T13:21:31.3359745Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3359993Z getattr(self, test_name)() 2025-12-04T13:21:31.3360230Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3360466Z fn() 2025-12-04T13:21:31.3360675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3360912Z method(*args, **kwargs) 2025-12-04T13:21:31.3361144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3361379Z method(*args, **kwargs) 2025-12-04T13:21:31.3361602Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3361836Z with policy(): 2025-12-04T13:21:31.3362054Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3362292Z raise RuntimeError(msg) 2025-12-04T13:21:31.3362736Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 2025-12-04T13:21:31.3363119Z 2025-12-04T13:21:31.3363202Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3363547Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3363813Z 2025-12-04T13:21:31.3363909Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3364036Z 2025-12-04T13:21:31.3364305Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3364452Z Traceback (most recent call last): 2025-12-04T13:21:31.3364702Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3364949Z getattr(self, test_name)() 2025-12-04T13:21:31.3365187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3365425Z fn() 2025-12-04T13:21:31.3365648Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3365923Z method(*args, **kwargs) 2025-12-04T13:21:31.3366147Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3366383Z method(*args, **kwargs) 2025-12-04T13:21:31.3366606Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3366839Z with policy(): 2025-12-04T13:21:31.3367056Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3367293Z raise RuntimeError(msg) 2025-12-04T13:21:31.3367721Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 
2025-12-04T13:21:31.3368107Z 2025-12-04T13:21:31.3368222Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3368567Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3368833Z 2025-12-04T13:21:31.3368923Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3369053Z 2025-12-04T13:21:31.3369055Z 2025-12-04T13:21:31.3369135Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3369340Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3369703Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-dbdb1831962e97ea.xml - 2025-12-04T13:21:31.3370037Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3370386Z FAILED [9.4156s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3370712Z Traceback (most recent call last): 2025-12-04T13:21:31.3370966Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3371217Z getattr(self, test_name)() 2025-12-04T13:21:31.3371455Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3371692Z fn() 2025-12-04T13:21:31.3371914Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3372151Z method(*args, **kwargs) 2025-12-04T13:21:31.3372378Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3372613Z method(*args, **kwargs) 2025-12-04T13:21:31.3372836Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3373067Z with policy(): 2025-12-04T13:21:31.3373285Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3373519Z raise RuntimeError(msg) 2025-12-04T13:21:31.3373943Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T13:21:31.3374329Z 2025-12-04T13:21:31.3374425Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3374782Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3375059Z 2025-12-04T13:21:31.3375154Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3375279Z 2025-12-04T13:21:31.3375340Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3375483Z Traceback (most recent call last): 2025-12-04T13:21:31.3375732Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3375978Z getattr(self, test_name)() 2025-12-04T13:21:31.3376212Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3376448Z fn() 2025-12-04T13:21:31.3376656Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3376893Z method(*args, **kwargs) 2025-12-04T13:21:31.3377116Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3377349Z method(*args, **kwargs) 2025-12-04T13:21:31.3377571Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3377800Z with policy(): 2025-12-04T13:21:31.3378015Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3378294Z raise RuntimeError(msg) 2025-12-04T13:21:31.3378718Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:21:31.3379101Z 2025-12-04T13:21:31.3379182Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3379522Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3379783Z 2025-12-04T13:21:31.3379876Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3380068Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3380238Z ======================= 1 failed, 1 deselected in 9.56s ======================== 2025-12-04T13:21:31.3380382Z Got exit code 1 2025-12-04T13:21:31.3380485Z Retrying single test... 
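Both failing sessions above also end with the ProcessGroupNCCL.cpp warning that destroy_process_group() was not called before program exit, which can leak resources. A minimal sketch of the explicit teardown it asks for, with placeholder rank/world-size wiring rather than the test harness's own setup:

import torch.distributed as dist

def run(rank: int, world_size: int) -> None:
    # With the default env:// init method, MASTER_ADDR/MASTER_PORT and friends
    # come from the launcher's environment.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        ...  # test or training body for this rank
    finally:
        # Explicit teardown avoids the "destroy_process_group() was not called"
        # warning seen before each worker exits in the log above.
        dist.destroy_process_group()

The retry below reproduces the same leak, so this warning is separate from the failure itself.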
2025-12-04T13:21:31.3380766Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4bd3df182e000d22.xml 2025-12-04T13:21:31.3381059Z ============================= test session starts ============================== 2025-12-04T13:21:31.3381279Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3381471Z cachedir: .pytest_cache 2025-12-04T13:21:31.3381701Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3381943Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3382068Z configfile: pytest.ini 2025-12-04T13:21:31.3382303Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3382582Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3382918Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3383249Z Running 1 items in this shard 2025-12-04T13:21:31.3383328Z 2025-12-04T13:21:31.3383648Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda I1204 12:58:58.235000 530896 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 530965 2025-12-04T13:21:31.3384144Z I1204 12:58:58.235000 530896 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 530966 2025-12-04T13:21:31.3384492Z I1204 12:58:58.236000 530896 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 530967 2025-12-04T13:21:31.3384836Z I1204 12:58:58.237000 530896 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 530968 2025-12-04T13:21:31.3385393Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3385840Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3386423Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3387021Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3387476Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3387916Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3388520Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3389104Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3389552Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3390003Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3390569Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3391150Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3391596Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3392033Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3392623Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3393229Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3393467Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3393810Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3394298Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3394777Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3395257Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3395706Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3396146Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3396609Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3397072Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3397536Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3397997Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3398494Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3398949Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3399426Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3400089Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T13:21:31.3400711Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3401057Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3401643Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3402187Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3402565Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3402979Z [rank0]:E1204 12:59:06.113000 530965 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3403218Z dist init r=0, world=4 2025-12-04T13:21:31.3403421Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3403758Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3404246Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3404726Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3405201Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3405649Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3406087Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3406550Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3407012Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3407473Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3407934Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3408433Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3408890Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3409356Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3410016Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:21:31.3410632Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3410980Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3411595Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3412110Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3412471Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3412882Z [rank3]:E1204 12:59:06.127000 530968 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3413122Z dist init r=3, world=4 2025-12-04T13:21:31.3413324Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3413663Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3414147Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3414627Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3415103Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3415549Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3415988Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3416449Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.3416909Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3417372Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3417845Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3418334Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3418790Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3419252Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3419909Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T13:21:31.3420554Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3420914Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3421496Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3421996Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3422358Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3422771Z [rank2]:E1204 12:59:06.185000 530967 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3423010Z dist init r=2, world=4 2025-12-04T13:21:31.3423211Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3423547Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3424031Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3424508Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3424985Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3425432Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3425869Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3426329Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3426804Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3427265Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3427725Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3428224Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3428680Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3429143Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3429815Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T13:21:31.3430458Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3430804Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3431389Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3431887Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3432249Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3432660Z [rank1]:E1204 12:59:06.195000 530966 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3432896Z dist init r=1, world=4 2025-12-04T13:21:31.3433296Z [rank0]:[W1204 12:59:06.961221808 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3433707Z FAILED [9.7141s] [100%] 2025-12-04T13:21:31.3433773Z 2025-12-04T13:21:31.3433832Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3434023Z ____ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda ____ 2025-12-04T13:21:31.3434199Z Traceback (most recent call last): 2025-12-04T13:21:31.3434444Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3434685Z self._join_processes(fn) 2025-12-04T13:21:31.3434930Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3435193Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3435458Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3435715Z raise RuntimeError(error) 2025-12-04T13:21:31.3435888Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3436048Z Traceback (most recent call last): 2025-12-04T13:21:31.3436286Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3436526Z getattr(self, test_name)() 2025-12-04T13:21:31.3436756Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3436984Z fn() 2025-12-04T13:21:31.3437185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3437414Z method(*args, **kwargs) 2025-12-04T13:21:31.3437634Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3437861Z method(*args, **kwargs) 2025-12-04T13:21:31.3438078Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3438336Z with policy(): 2025-12-04T13:21:31.3438567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3438828Z raise RuntimeError(msg) 2025-12-04T13:21:31.3439242Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 2025-12-04T13:21:31.3439624Z 2025-12-04T13:21:31.3439698Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3440035Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3440297Z 2025-12-04T13:21:31.3440386Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3440509Z 2025-12-04T13:21:31.3440511Z 2025-12-04T13:21:31.3440594Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3440795Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3441150Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4bd3df182e000d22.xml - 2025-12-04T13:21:31.3441477Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3441819Z FAILED [9.7141s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3442139Z Traceback (most recent call last): 2025-12-04T13:21:31.3442388Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3442628Z getattr(self, test_name)() 2025-12-04T13:21:31.3442862Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3443094Z fn() 2025-12-04T13:21:31.3443295Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3443523Z method(*args, **kwargs) 2025-12-04T13:21:31.3443740Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3443969Z method(*args, **kwargs) 2025-12-04T13:21:31.3444185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3444409Z with policy(): 2025-12-04T13:21:31.3444634Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3444863Z raise RuntimeError(msg) 2025-12-04T13:21:31.3445282Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T13:21:31.3445660Z 2025-12-04T13:21:31.3445735Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3446069Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3446329Z 2025-12-04T13:21:31.3446416Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3446603Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3446765Z ======================= 1 failed, 18 deselected in 9.85s ======================= 2025-12-04T13:21:31.3446913Z Got exit code 1 2025-12-04T13:21:31.3447020Z Retrying single test... 2025-12-04T13:21:31.3447288Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-97bb0ef2ed351f4f.xml 2025-12-04T13:21:31.3447568Z ============================= test session starts ============================== 2025-12-04T13:21:31.3447777Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3447961Z cachedir: .pytest_cache 2025-12-04T13:21:31.3448214Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3448450Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3448566Z configfile: pytest.ini 2025-12-04T13:21:31.3448791Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3449063Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3449391Z stepcurrent: skipping 1 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3449685Z Running 1 items in this shard 2025-12-04T13:21:31.3449758Z 2025-12-04T13:21:31.3450060Z distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda I1204 12:59:10.585000 531298 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 531367 2025-12-04T13:21:31.3450546Z I1204 12:59:10.586000 531298 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 531368 2025-12-04T13:21:31.3450887Z I1204 12:59:10.587000 531298 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 531369 2025-12-04T13:21:31.3451226Z I1204 12:59:10.588000 531298 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 531370 2025-12-04T13:21:31.3451778Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3452225Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3452805Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3453404Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3453856Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3454293Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3454867Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3455449Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3455914Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3456362Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3456943Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3457523Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3457969Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.3458435Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.3459004Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3459585Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3459825Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3460171Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3460665Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3461145Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3461623Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3462071Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3462524Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3462989Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3463456Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3463919Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3464380Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3464829Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3465298Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3465783Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3466466Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
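The repeated UserWarning above ("FSDP got the argument `device_id` cuda ..., which does not have an explicit index") names its own two fixes: call torch.cuda.set_device() before constructing FSDP, or pass a device with an explicit index as device_id. A minimal sketch of both follows, assuming one process per GPU with the local rank as the device index and a process group that is already initialized; the helper name and the nn.Linear module are placeholders.

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_model(rank: int) -> FSDP:
        # Fix 1 from the warning: make the current device explicit before FSDP init.
        torch.cuda.set_device(rank)
        model = nn.Linear(8, 8)
        # Fix 2 from the warning: pass an indexed device instead of a bare "cuda".
        return FSDP(model, device_id=torch.device(f"cuda:{rank}"))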
2025-12-04T13:21:31.3467088Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3467438Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3468026Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3468559Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3468923Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3469338Z [rank0]:E1204 12:59:18.583000 531367 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3469579Z dist init r=0, world=4 2025-12-04T13:21:31.3469786Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3470123Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3470608Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3471085Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3471560Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3472026Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3472465Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3472931Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3473394Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3473854Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3474317Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3474785Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3475265Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3475730Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3476390Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 2. CUDA driver allocated memory was 2300575744 and is now 3380609024. 2025-12-04T13:21:31.3477016Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3477366Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3477952Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3478585Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3478953Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3479368Z [rank2]:E1204 12:59:18.610000 531369 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3479609Z dist init r=2, world=4 2025-12-04T13:21:31.3479813Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3480151Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3480640Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3481118Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3481615Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3482064Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3482502Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3482967Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.3483430Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3483909Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3484389Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3484859Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3485315Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3485783Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3486444Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 3. CUDA driver allocated memory was 2250244096 and is now 3330277376. 2025-12-04T13:21:31.3487066Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3487416Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3488001Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3488117Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3488370Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3488539Z [rank3]:E1204 12:59:18.622000 531370 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3488578Z dist init r=3, world=4 2025-12-04T13:21:31.3488719Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3488879Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3489181Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3489336Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3489625Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3489750Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3490030Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3490180Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3490475Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3490651Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3490931Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3491070Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3491351Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3491501Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3491976Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 1. CUDA driver allocated memory was 2317352960 and is now 3397386240. 
2025-12-04T13:21:31.3492091Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3492289Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3492643Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3492760Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3492974Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3493140Z [rank1]:E1204 12:59:18.667000 531368 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3493181Z dist init r=1, world=4 2025-12-04T13:21:31.3493532Z [rank0]:[W1204 12:59:18.416892830 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3493576Z FAILED [9.8144s] [100%] 2025-12-04T13:21:31.3493578Z 2025-12-04T13:21:31.3493637Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3493734Z ____ TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda ____ 2025-12-04T13:21:31.3493781Z Traceback (most recent call last): 2025-12-04T13:21:31.3493946Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3493990Z self._join_processes(fn) 2025-12-04T13:21:31.3494165Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3494220Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3494400Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3494456Z raise RuntimeError(error) 2025-12-04T13:21:31.3494555Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3494616Z Traceback (most recent call last): 2025-12-04T13:21:31.3494780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3494824Z getattr(self, test_name)() 2025-12-04T13:21:31.3494982Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3495018Z fn() 2025-12-04T13:21:31.3495170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3495212Z method(*args, **kwargs) 2025-12-04T13:21:31.3495365Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3495407Z method(*args, **kwargs) 2025-12-04T13:21:31.3495561Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3495601Z with policy(): 2025-12-04T13:21:31.3495753Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3495795Z raise RuntimeError(msg) 2025-12-04T13:21:31.3496142Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 2025-12-04T13:21:31.3496145Z 2025-12-04T13:21:31.3496224Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3496452Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3496457Z 2025-12-04T13:21:31.3496546Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3496548Z 2025-12-04T13:21:31.3496550Z 2025-12-04T13:21:31.3496628Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3496716Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3496954Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-97bb0ef2ed351f4f.xml - 2025-12-04T13:21:31.3497016Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3497271Z FAILED [9.8144s] distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3497319Z Traceback (most recent call last): 2025-12-04T13:21:31.3497486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3497530Z getattr(self, test_name)() 2025-12-04T13:21:31.3497693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3497727Z fn() 2025-12-04T13:21:31.3497881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3497922Z method(*args, **kwargs) 2025-12-04T13:21:31.3498075Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3498115Z method(*args, **kwargs) 2025-12-04T13:21:31.3498310Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3498380Z with policy(): 2025-12-04T13:21:31.3498533Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3498590Z raise RuntimeError(msg) 2025-12-04T13:21:31.3498937Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda! Caching allocator allocated memory was 512 and is now reported as 19456 on device 0. CUDA driver allocated memory was 2453667840 and is now 3533701120. 
2025-12-04T13:21:31.3498940Z 2025-12-04T13:21:31.3499016Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3499244Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestHooksCUDA.test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3499246Z 2025-12-04T13:21:31.3499335Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3499400Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3499465Z ======================= 1 failed, 18 deselected in 9.95s ======================= 2025-12-04T13:21:31.3499502Z Got exit code 1 2025-12-04T13:21:31.3499679Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda 2025-12-04T13:21:31.3499807Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3500002Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c3d64beaed0e8212.xml 2025-12-04T13:21:31.3500064Z ============================= test session starts ============================== 2025-12-04T13:21:31.3500178Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3500221Z cachedir: .pytest_cache 2025-12-04T13:21:31.3500382Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3500431Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3500472Z configfile: pytest.ini 2025-12-04T13:21:31.3500637Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3500712Z collecting ... collected 60 items / 2 deselected / 58 selected 2025-12-04T13:21:31.3500768Z stepcurrent: skipping 2 already run items. 2025-12-04T13:21:31.3500812Z Running 17 items in this shard 2025-12-04T13:21:31.3500814Z 2025-12-04T13:21:31.3501134Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda I1204 12:59:23.001000 531700 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 531769 2025-12-04T13:21:31.3501291Z I1204 12:59:23.002000 531700 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 531770 2025-12-04T13:21:31.3501448Z I1204 12:59:23.002000 531700 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 531771 2025-12-04T13:21:31.3501599Z I1204 12:59:23.003000 531700 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 531772 2025-12-04T13:21:31.3502189Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3502229Z _warn_cpu_init() 2025-12-04T13:21:31.3502815Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3502877Z _warn_cpu_init() 2025-12-04T13:21:31.3503447Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3503488Z _warn_cpu_init() 2025-12-04T13:21:31.3504055Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3504092Z _warn_cpu_init() 2025-12-04T13:21:31.3504387Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:21:31.3504431Z return func(*args, **kwargs) 2025-12-04T13:21:31.3504579Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3504744Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3505036Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3505192Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3505490Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3505617Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3505897Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3506048Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3506325Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3506474Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3506762Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3506912Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3507209Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3507358Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3507838Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
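Both warnings above point the same way: bind each worker to its GPU explicitly. The CPU-sharding-init warning (_warn_cpu_init) is addressed by passing device_id to FSDP as in the earlier sketch, and the barrier() warning says device_id can be given to init_process_group so collectives no longer guess the device from the current context. A hedged sketch of that setup follows; the rank-to-device mapping and the rendezvous address are illustrative.

    import torch
    import torch.distributed as dist

    def init_for_rank(rank: int, world_size: int) -> None:
        device = torch.device(f"cuda:{rank}")
        # Telling the process group its device up front is the fix the barrier()
        # warning above suggests.
        dist.init_process_group(
            "nccl", rank=rank, world_size=world_size,
            init_method="tcp://127.0.0.1:29501",
            device_id=device,
        )
        dist.barrier()  # no longer has to infer the device from the current context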
2025-12-04T13:21:31.3507957Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3508207Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3508567Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3508682Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3508897Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3509064Z [rank0]:E1204 12:59:54.239000 531769 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3509107Z dist init r=0, world=4 2025-12-04T13:21:31.3509247Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3509409Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3509699Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3509870Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3510158Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3510284Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3510564Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3510712Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3510990Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3511156Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3511458Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3511597Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3511876Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3512027Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3512503Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:21:31.3512621Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3512816Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3513173Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3513290Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3513502Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3513670Z [rank1]:E1204 12:59:54.245000 531770 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3513810Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3513972Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3514270Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3514428Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3514716Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3514840Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3515116Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3515265Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3515553Z [rank2]:E1204 
12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3515721Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3515998Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3516136Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3516416Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3516566Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3517041Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3517158Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3517355Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3517711Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3517827Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3518038Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3518244Z [rank2]:E1204 12:59:54.245000 531771 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3518283Z dist init r=1, world=4 2025-12-04T13:21:31.3518323Z dist init r=2, world=4 2025-12-04T13:21:31.3518477Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3518642Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3518932Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3519089Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, 
test_name)() 2025-12-04T13:21:31.3519374Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3519501Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3519794Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3519970Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3520246Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3520393Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3520674Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3520813Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3521094Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3521245Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3521719Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T13:21:31.3521836Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3522034Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3522390Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3522505Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3522727Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3522895Z [rank3]:E1204 12:59:54.249000 531772 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3522935Z dist init r=3, world=4 2025-12-04T13:21:31.3523280Z [rank0]:[W1204 12:59:54.080538617 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3523323Z FAILED [33.1317s] [ 5%] 2025-12-04T13:21:31.3523325Z 2025-12-04T13:21:31.3523384Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3523486Z ____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda _____ 2025-12-04T13:21:31.3523534Z Traceback (most recent call last): 2025-12-04T13:21:31.3523699Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3523745Z self._join_processes(fn) 2025-12-04T13:21:31.3523929Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3524008Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3524186Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3524232Z raise RuntimeError(error) 2025-12-04T13:21:31.3524314Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3524361Z Traceback (most recent call last): 2025-12-04T13:21:31.3524525Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3524568Z getattr(self, test_name)() 2025-12-04T13:21:31.3524730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3524765Z fn() 2025-12-04T13:21:31.3524920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3524962Z method(*args, **kwargs) 2025-12-04T13:21:31.3525117Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3525157Z method(*args, **kwargs) 2025-12-04T13:21:31.3525310Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3525347Z with policy(): 2025-12-04T13:21:31.3525502Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3525543Z raise RuntimeError(msg) 2025-12-04T13:21:31.3525896Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3525900Z 2025-12-04T13:21:31.3525976Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3526207Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3526210Z 2025-12-04T13:21:31.3526300Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3526302Z 2025-12-04T13:21:31.3526363Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3526411Z Traceback (most recent call last): 2025-12-04T13:21:31.3526591Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3526635Z getattr(self, test_name)() 2025-12-04T13:21:31.3526796Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3526836Z fn() 2025-12-04T13:21:31.3526987Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3527029Z method(*args, **kwargs) 2025-12-04T13:21:31.3527180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3527222Z method(*args, **kwargs) 2025-12-04T13:21:31.3527373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3527411Z with policy(): 2025-12-04T13:21:31.3527566Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3527609Z raise RuntimeError(msg) 2025-12-04T13:21:31.3527968Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:21:31.3527991Z 2025-12-04T13:21:31.3528071Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3528346Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3528348Z 2025-12-04T13:21:31.3528435Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3528437Z 2025-12-04T13:21:31.3528439Z 2025-12-04T13:21:31.3528519Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3528608Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3528847Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c3d64beaed0e8212.xml - 2025-12-04T13:21:31.3528910Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3529158Z FAILED [33.1317s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3529204Z Traceback (most recent call last): 2025-12-04T13:21:31.3529370Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3529413Z getattr(self, test_name)() 2025-12-04T13:21:31.3529575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3529610Z fn() 2025-12-04T13:21:31.3529777Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3529823Z method(*args, **kwargs) 2025-12-04T13:21:31.3529975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3530018Z method(*args, **kwargs) 2025-12-04T13:21:31.3530169Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3530208Z with policy(): 2025-12-04T13:21:31.3530360Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3530402Z raise RuntimeError(msg) 2025-12-04T13:21:31.3530785Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:21:31.3530789Z 2025-12-04T13:21:31.3530865Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3531094Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3531096Z 2025-12-04T13:21:31.3531183Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3531186Z 2025-12-04T13:21:31.3531245Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3531291Z Traceback (most recent call last): 2025-12-04T13:21:31.3531456Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3531499Z getattr(self, test_name)() 2025-12-04T13:21:31.3531675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3531739Z fn() 2025-12-04T13:21:31.3531892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3531933Z method(*args, **kwargs) 2025-12-04T13:21:31.3532085Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3532124Z method(*args, **kwargs) 2025-12-04T13:21:31.3532277Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3532314Z with policy(): 2025-12-04T13:21:31.3532468Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3532510Z raise RuntimeError(msg) 2025-12-04T13:21:31.3532860Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3532864Z 2025-12-04T13:21:31.3532938Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3533167Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3533170Z 2025-12-04T13:21:31.3533257Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3533323Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3533389Z ======================= 1 failed, 2 deselected in 33.27s ======================= 2025-12-04T13:21:31.3533426Z Got exit code 1 2025-12-04T13:21:31.3533469Z Retrying single test... 
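[editor note] The repro line printed repeatedly in the session above sets two environment flags and runs the test file directly from the base repo dir. A small Python wrapper around that exact logged command, for convenience only; nothing beyond the command shown in the log is assumed:

    import os
    import subprocess

    env = dict(os.environ,
               PYTORCH_TEST_WITH_ROCM="1",
               PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1")
    # Setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 would suppress the repro message, per the log.
    subprocess.run(
        ["python", "test/distributed/fsdp/test_fsdp_core.py",
         "TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda"],
        env=env,
        check=False,
    )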
2025-12-04T13:21:31.3533662Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0e3e8cedde9f2a88.xml 2025-12-04T13:21:31.3533724Z ============================= test session starts ============================== 2025-12-04T13:21:31.3533838Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3533882Z cachedir: .pytest_cache 2025-12-04T13:21:31.3534043Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3534091Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3534132Z configfile: pytest.ini 2025-12-04T13:21:31.3534308Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3534385Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3534611Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3534655Z Running 1 items in this shard 2025-12-04T13:21:31.3534657Z 2025-12-04T13:21:31.3534961Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda I1204 12:59:58.440000 532102 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 532171 2025-12-04T13:21:31.3535115Z I1204 12:59:58.441000 532102 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 532172 2025-12-04T13:21:31.3535269Z I1204 12:59:58.441000 532102 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 532173 2025-12-04T13:21:31.3535424Z I1204 12:59:58.442000 532102 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 532174 2025-12-04T13:21:31.3536026Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3536075Z _warn_cpu_init() 2025-12-04T13:21:31.3536643Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3536682Z _warn_cpu_init() 2025-12-04T13:21:31.3537253Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3537290Z _warn_cpu_init() 2025-12-04T13:21:31.3537858Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3537897Z _warn_cpu_init() 2025-12-04T13:21:31.3538232Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3538276Z return func(*args, **kwargs) 2025-12-04T13:21:31.3538421Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3538586Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3538891Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3539053Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3539341Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3539469Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3539748Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3539900Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3540191Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3540368Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3540645Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3540783Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3541065Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3541214Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3541699Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3541816Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3542013Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3542372Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3542489Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3542702Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3542866Z [rank3]:E1204 13:00:29.647000 532174 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3542908Z dist init r=3, world=4 2025-12-04T13:21:31.3543057Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3543220Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3543510Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3543666Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3543957Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3544083Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3544371Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3544540Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3544817Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3544966Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3545243Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3545383Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3545661Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3545811Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3546291Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3546408Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3546607Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3546963Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3547079Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3547302Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3547468Z [rank2]:E1204 13:00:29.651000 532173 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3547509Z dist init r=2, world=4 2025-12-04T13:21:31.3547648Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3547809Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3548097Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3548312Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3548613Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] 
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3548751Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3549042Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3549193Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3549470Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3549621Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3549900Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3550038Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3550315Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3550464Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3550944Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T13:21:31.3551060Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3551258Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3551613Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3551740Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3551954Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3552119Z [rank1]:E1204 13:00:29.653000 532172 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3552161Z dist init r=1, world=4 2025-12-04T13:21:31.3552298Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3552459Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3552747Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3552913Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3553226Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3553349Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3553628Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3553777Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3554055Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3554205Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3554483Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3554623Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3554902Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3555055Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3555532Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3555649Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3555846Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3556213Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3556332Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3556542Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3556709Z [rank0]:E1204 13:00:29.654000 532171 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3556748Z dist init r=0, world=4 2025-12-04T13:21:31.3557087Z [rank0]:[W1204 13:00:29.506401813 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3557149Z FAILED [33.0345s] [100%] 2025-12-04T13:21:31.3557168Z 2025-12-04T13:21:31.3557227Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3557340Z ____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda _____ 2025-12-04T13:21:31.3557388Z Traceback (most recent call last): 2025-12-04T13:21:31.3557554Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3557599Z self._join_processes(fn) 2025-12-04T13:21:31.3557774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3557828Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3558009Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3558053Z raise RuntimeError(error) 2025-12-04T13:21:31.3558138Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3558208Z Traceback (most recent call last): 2025-12-04T13:21:31.3558371Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3558413Z getattr(self, test_name)() 2025-12-04T13:21:31.3558572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3558606Z fn() 2025-12-04T13:21:31.3558759Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3558800Z method(*args, **kwargs) 2025-12-04T13:21:31.3558957Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3558997Z method(*args, **kwargs) 2025-12-04T13:21:31.3559150Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3559189Z with policy(): 2025-12-04T13:21:31.3559344Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3559385Z raise RuntimeError(msg) 2025-12-04T13:21:31.3559739Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T13:21:31.3559741Z 2025-12-04T13:21:31.3559819Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3560069Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3560073Z 2025-12-04T13:21:31.3560165Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3560168Z 2025-12-04T13:21:31.3560170Z 2025-12-04T13:21:31.3560245Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3560335Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3560571Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0e3e8cedde9f2a88.xml - 2025-12-04T13:21:31.3560633Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3560880Z FAILED [33.0345s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3560928Z Traceback (most recent call last): 2025-12-04T13:21:31.3561107Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3561182Z getattr(self, test_name)() 2025-12-04T13:21:31.3561344Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3561379Z fn() 2025-12-04T13:21:31.3561534Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3561575Z method(*args, **kwargs) 2025-12-04T13:21:31.3561728Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3561768Z method(*args, **kwargs) 2025-12-04T13:21:31.3561922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3561959Z with policy(): 2025-12-04T13:21:31.3562116Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3562159Z raise RuntimeError(msg) 2025-12-04T13:21:31.3562511Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3562513Z 2025-12-04T13:21:31.3562587Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3562817Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3562820Z 2025-12-04T13:21:31.3562908Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3562975Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
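[editor note] Both failing sessions also emit the ProcessGroupNCCL warning that destroy_process_group() was not called before the worker exited. A minimal sketch of the teardown that warning asks for; the rendezvous details (MASTER_ADDR/MASTER_PORT, backend) are illustrative and not taken from the test harness:

    import os
    import torch.distributed as dist

    def run_worker(rank: int, world_size: int) -> None:
        os.environ.setdefault("MASTER_ADDR", "127.0.0.1")  # assumed single-node rendezvous
        os.environ.setdefault("MASTER_PORT", "29500")
        dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
        try:
            pass  # the test body would run here
        finally:
            dist.destroy_process_group()  # explicit shutdown, as the warning recommends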
2025-12-04T13:21:31.3563041Z ====================== 1 failed, 18 deselected in 33.17s ======================= 2025-12-04T13:21:31.3563081Z Got exit code 1 2025-12-04T13:21:31.3563122Z Retrying single test... 2025-12-04T13:21:31.3563314Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-85422e17b079f439.xml 2025-12-04T13:21:31.3563374Z ============================= test session starts ============================== 2025-12-04T13:21:31.3563487Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3563529Z cachedir: .pytest_cache 2025-12-04T13:21:31.3563698Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3563745Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3563786Z configfile: pytest.ini 2025-12-04T13:21:31.3563952Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3564027Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3564253Z stepcurrent: skipping 2 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3564297Z Running 1 items in this shard 2025-12-04T13:21:31.3564299Z 2025-12-04T13:21:31.3564608Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda I1204 13:00:33.883000 532504 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 532573 2025-12-04T13:21:31.3564765Z I1204 13:00:33.883000 532504 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 532574 2025-12-04T13:21:31.3564932Z I1204 13:00:33.884000 532504 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 532575 2025-12-04T13:21:31.3565105Z I1204 13:00:33.884000 532504 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 532576 2025-12-04T13:21:31.3565683Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3565723Z _warn_cpu_init() 2025-12-04T13:21:31.3566291Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3566333Z _warn_cpu_init() 2025-12-04T13:21:31.3566902Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3566941Z _warn_cpu_init() 2025-12-04T13:21:31.3567505Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3567544Z _warn_cpu_init() 2025-12-04T13:21:31.3567836Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3567879Z return func(*args, **kwargs) 2025-12-04T13:21:31.3568035Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3568243Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3568536Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3568695Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3568984Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3569111Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3569409Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3569585Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3569865Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3570015Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3570293Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3570432Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3570713Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3570862Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3571343Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3571462Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3571659Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3572018Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3572134Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3572347Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3572524Z [rank0]:E1204 13:01:05.237000 532573 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3572568Z dist init r=0, world=4 2025-12-04T13:21:31.3572707Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3572870Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3587739Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3587926Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3588282Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3588479Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3588782Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3588935Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3589213Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3589363Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3589640Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3589781Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3590060Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3590208Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3590693Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T13:21:31.3590814Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3591013Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3591375Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3591504Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3591721Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3591889Z [rank3]:E1204 13:01:05.243000 532576 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3591931Z dist init r=3, world=4 2025-12-04T13:21:31.3592072Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3592232Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3592521Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3592687Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3592984Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3593122Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3593400Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3593549Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3593830Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3593978Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3594255Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3594392Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3594670Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3594819Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3595298Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:21:31.3595415Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3595611Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3595981Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3596099Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3596312Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3596477Z [rank1]:E1204 13:01:05.282000 532574 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3596517Z dist init r=1, world=4 2025-12-04T13:21:31.3596656Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3596816Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3597116Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3597298Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3597584Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3597708Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3597985Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3598136Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.3598461Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3598610Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3598885Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3599023Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3599301Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3599451Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3599926Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3600059Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3600256Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3600612Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3600727Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3600941Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3601106Z [rank2]:E1204 13:01:05.299000 532575 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3601145Z dist init r=2, world=4 2025-12-04T13:21:31.3601497Z [rank0]:[W1204 13:01:05.081495291 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3601580Z FAILED [33.2375s] [100%] 2025-12-04T13:21:31.3601583Z 2025-12-04T13:21:31.3601643Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3601746Z ____ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda _____ 2025-12-04T13:21:31.3601793Z Traceback (most recent call last): 2025-12-04T13:21:31.3601961Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3602005Z self._join_processes(fn) 2025-12-04T13:21:31.3602181Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3602236Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3602416Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3602461Z raise RuntimeError(error) 2025-12-04T13:21:31.3602544Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3602590Z Traceback (most recent call last): 2025-12-04T13:21:31.3602752Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3602795Z getattr(self, test_name)() 2025-12-04T13:21:31.3602954Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3602989Z fn() 2025-12-04T13:21:31.3603143Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3603186Z method(*args, **kwargs) 2025-12-04T13:21:31.3603339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3603382Z method(*args, **kwargs) 2025-12-04T13:21:31.3603532Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3603569Z with policy(): 2025-12-04T13:21:31.3603720Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3603761Z raise RuntimeError(msg) 2025-12-04T13:21:31.3604124Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:21:31.3604128Z 2025-12-04T13:21:31.3604207Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3604438Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3604442Z 2025-12-04T13:21:31.3604532Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3604535Z 2025-12-04T13:21:31.3604596Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3604642Z Traceback (most recent call last): 2025-12-04T13:21:31.3604807Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3604848Z getattr(self, test_name)() 2025-12-04T13:21:31.3605010Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3605044Z fn() 2025-12-04T13:21:31.3605208Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3605269Z method(*args, **kwargs) 2025-12-04T13:21:31.3605421Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3605461Z method(*args, **kwargs) 2025-12-04T13:21:31.3605611Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3605647Z with policy(): 2025-12-04T13:21:31.3605800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3605839Z raise RuntimeError(msg) 2025-12-04T13:21:31.3606191Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3606195Z 2025-12-04T13:21:31.3606271Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3606499Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3606502Z 2025-12-04T13:21:31.3606591Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3606593Z 2025-12-04T13:21:31.3606595Z 2025-12-04T13:21:31.3606672Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3606762Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.3606997Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-85422e17b079f439.xml - 2025-12-04T13:21:31.3607061Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3607311Z FAILED [33.2375s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3607359Z Traceback (most recent call last): 2025-12-04T13:21:31.3607523Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3607566Z getattr(self, test_name)() 2025-12-04T13:21:31.3607728Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3607761Z fn() 2025-12-04T13:21:31.3607926Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3607965Z method(*args, **kwargs) 2025-12-04T13:21:31.3608119Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3608202Z method(*args, **kwargs) 2025-12-04T13:21:31.3608353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3608389Z with policy(): 2025-12-04T13:21:31.3608542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3608581Z raise RuntimeError(msg) 2025-12-04T13:21:31.3608934Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:21:31.3608936Z 2025-12-04T13:21:31.3609027Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3609270Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3609444Z 2025-12-04T13:21:31.3609531Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3609534Z 2025-12-04T13:21:31.3609592Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3609637Z Traceback (most recent call last): 2025-12-04T13:21:31.3609798Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3609840Z getattr(self, test_name)() 2025-12-04T13:21:31.3610000Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3610034Z fn() 2025-12-04T13:21:31.3610185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3610226Z method(*args, **kwargs) 2025-12-04T13:21:31.3610376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3610417Z method(*args, **kwargs) 2025-12-04T13:21:31.3610566Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3610604Z with policy(): 2025-12-04T13:21:31.3610754Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3610794Z raise RuntimeError(msg) 2025-12-04T13:21:31.3611143Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3611146Z 2025-12-04T13:21:31.3611221Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3611448Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3611451Z 2025-12-04T13:21:31.3611537Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3611603Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3611666Z ====================== 1 failed, 18 deselected in 33.38s ======================= 2025-12-04T13:21:31.3611704Z Got exit code 1 2025-12-04T13:21:31.3611893Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda 2025-12-04T13:21:31.3612023Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3612214Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3c8429d2d3d8f75c.xml 2025-12-04T13:21:31.3612274Z ============================= test session starts ============================== 2025-12-04T13:21:31.3612390Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3612432Z cachedir: .pytest_cache 2025-12-04T13:21:31.3612591Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3612639Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3612680Z configfile: pytest.ini 2025-12-04T13:21:31.3612846Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3612930Z collecting ... collected 60 items / 3 deselected / 57 selected 2025-12-04T13:21:31.3612994Z stepcurrent: skipping 3 already run items. 2025-12-04T13:21:31.3613047Z Running 16 items in this shard 2025-12-04T13:21:31.3613049Z 2025-12-04T13:21:31.3613369Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda I1204 13:01:09.649000 532906 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 532975 2025-12-04T13:21:31.3613525Z I1204 13:01:09.650000 532906 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 532976 2025-12-04T13:21:31.3613677Z I1204 13:01:09.651000 532906 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 532977 2025-12-04T13:21:31.3613830Z I1204 13:01:09.652000 532906 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 532978 2025-12-04T13:21:31.3614416Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3614458Z _warn_cpu_init() 2025-12-04T13:21:31.3615025Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3615063Z _warn_cpu_init() 2025-12-04T13:21:31.3615633Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3615670Z _warn_cpu_init() 2025-12-04T13:21:31.3616247Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3616284Z _warn_cpu_init() 2025-12-04T13:21:31.3616579Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3616623Z return func(*args, **kwargs) 2025-12-04T13:21:31.3616768Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3616931Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3617221Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3617386Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3617696Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3617823Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3618100Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3618284Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3618563Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3618712Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3618990Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3619128Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3619407Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3619557Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3620046Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3620162Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3620370Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3620740Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3620856Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3621070Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3621234Z [rank2]:E1204 13:01:40.982000 532977 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3621274Z dist init r=2, world=4 2025-12-04T13:21:31.3621417Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3621589Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3621887Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3622052Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3622336Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3622461Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3622739Z [rank3]:E1204 13:01:40.994000 532978 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3622888Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3623164Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3623311Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3623588Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3623726Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3624004Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3624154Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3624649Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 
2025-12-04T13:21:31.3624763Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3624960Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3625326Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3625441Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3625654Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3625819Z [rank3]:E1204 13:01:40.994000 532978 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3625876Z dist init r=3, world=4 2025-12-04T13:21:31.3626016Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3626187Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3626474Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3626628Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3626912Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3627038Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3627315Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3627461Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3627737Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3627885Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3628201Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3628337Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3628616Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3628763Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3629261Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3629378Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3629573Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3629936Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3630050Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3630274Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3630461Z [rank0]:E1204 13:01:41.034000 532975 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3630500Z dist init r=0, world=4 2025-12-04T13:21:31.3630637Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3630798Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3631086Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3631239Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3631524Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3631646Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3631922Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3632069Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.3632345Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3632494Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3632769Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3632905Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3633193Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3633342Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3633826Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:21:31.3633940Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3634137Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3634517Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3634652Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3634862Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3635026Z [rank1]:E1204 13:01:41.042000 532976 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3635063Z dist init r=1, world=4 2025-12-04T13:21:31.3635407Z [rank0]:[W1204 13:01:41.978118501 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3635449Z FAILED [33.2346s] [ 6%] 2025-12-04T13:21:31.3635452Z 2025-12-04T13:21:31.3635510Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3635618Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.3635663Z Traceback (most recent call last): 2025-12-04T13:21:31.3635828Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3635871Z self._join_processes(fn) 2025-12-04T13:21:31.3636044Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3636097Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3636279Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3636323Z raise RuntimeError(error) 2025-12-04T13:21:31.3636405Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3636450Z Traceback (most recent call last): 2025-12-04T13:21:31.3636610Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3636652Z getattr(self, test_name)() 2025-12-04T13:21:31.3636810Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3636844Z fn() 2025-12-04T13:21:31.3636996Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3637036Z method(*args, **kwargs) 2025-12-04T13:21:31.3637197Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3637237Z method(*args, **kwargs) 2025-12-04T13:21:31.3637390Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3637428Z with policy(): 2025-12-04T13:21:31.3637581Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3637621Z raise RuntimeError(msg) 2025-12-04T13:21:31.3637981Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
2025-12-04T13:21:31.3637984Z 2025-12-04T13:21:31.3638061Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3638363Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3638381Z 2025-12-04T13:21:31.3638481Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3638483Z 2025-12-04T13:21:31.3638485Z 2025-12-04T13:21:31.3638561Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3638651Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3638885Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-3c8429d2d3d8f75c.xml - 2025-12-04T13:21:31.3638947Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3639203Z FAILED [33.2346s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3639250Z Traceback (most recent call last): 2025-12-04T13:21:31.3639415Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3639457Z getattr(self, test_name)() 2025-12-04T13:21:31.3639618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3639652Z fn() 2025-12-04T13:21:31.3639804Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3639843Z method(*args, **kwargs) 2025-12-04T13:21:31.3639994Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3640035Z method(*args, **kwargs) 2025-12-04T13:21:31.3640186Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3640224Z with policy(): 2025-12-04T13:21:31.3640376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3640418Z raise RuntimeError(msg) 2025-12-04T13:21:31.3640777Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3640779Z 2025-12-04T13:21:31.3640853Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3641105Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3641107Z 2025-12-04T13:21:31.3641199Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3641262Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3641326Z ======================= 1 failed, 3 deselected in 33.37s ======================= 2025-12-04T13:21:31.3641364Z Got exit code 1 2025-12-04T13:21:31.3641404Z Retrying single test... 2025-12-04T13:21:31.3641593Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c20ad9eb622651c0.xml 2025-12-04T13:21:31.3641651Z ============================= test session starts ============================== 2025-12-04T13:21:31.3641764Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3641807Z cachedir: .pytest_cache 2025-12-04T13:21:31.3641964Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3642021Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3642071Z configfile: pytest.ini 2025-12-04T13:21:31.3642248Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3642323Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3642558Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3642601Z Running 1 items in this shard 2025-12-04T13:21:31.3642603Z 2025-12-04T13:21:31.3642922Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda I1204 13:01:45.518000 533308 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 533377 2025-12-04T13:21:31.3643079Z I1204 13:01:45.519000 533308 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 533378 2025-12-04T13:21:31.3643230Z I1204 13:01:45.520000 533308 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 533379 2025-12-04T13:21:31.3643383Z I1204 13:01:45.521000 533308 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 533380 2025-12-04T13:21:31.3643960Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3643999Z _warn_cpu_init() 2025-12-04T13:21:31.3644568Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3644608Z _warn_cpu_init() 2025-12-04T13:21:31.3645185Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3645222Z _warn_cpu_init() 2025-12-04T13:21:31.3645787Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3645826Z _warn_cpu_init() 2025-12-04T13:21:31.3646117Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3646160Z return func(*args, **kwargs) 2025-12-04T13:21:31.3646303Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3646476Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3646785Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3646942Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3647228Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3647354Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3647632Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3647783Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3648062Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3648255Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3648533Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3648672Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3648956Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3649104Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3649611Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3649729Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3649925Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3650294Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3650408Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3650622Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3650800Z [rank2]:E1204 13:02:16.914000 533379 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3650963Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3651123Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3651412Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3651565Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3651851Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3651977Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3652254Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3652402Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3652679Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3652825Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3653102Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3653238Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3653519Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3653666Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3654163Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T13:21:31.3654280Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3654475Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3654841Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3654955Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3655187Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3655361Z [rank1]:E1204 13:02:16.914000 533378 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3655400Z dist init r=2, world=4 2025-12-04T13:21:31.3655439Z dist init r=1, world=4 2025-12-04T13:21:31.3655577Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3655737Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3656032Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3656187Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3656471Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3656595Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3656870Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3657019Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3657295Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3657443Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3657719Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3657855Z [rank0]:E1204 13:02:16.973000 533377 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3658183Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3658333Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3658818Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3658934Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3659129Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3659512Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3659650Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3659861Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3660024Z [rank0]:E1204 13:02:16.973000 533377 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3660163Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3660324Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3660614Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3660769Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3661054Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3661178Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3661455Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3661605Z [rank3]:E1204 13:02:16.974000 533380 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3661881Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3662027Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3662316Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3662452Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3662732Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3662881Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3663367Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3663481Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3663696Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3664078Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3664191Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3664403Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3664568Z [rank3]:E1204 13:02:16.974000 533380 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3664608Z dist init r=0, world=4 2025-12-04T13:21:31.3664645Z dist init r=3, world=4 2025-12-04T13:21:31.3664987Z [rank0]:[W1204 13:02:17.953865349 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3665028Z FAILED [33.2347s] [100%] 2025-12-04T13:21:31.3665031Z 2025-12-04T13:21:31.3665087Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3665195Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.3665240Z Traceback (most recent call last): 2025-12-04T13:21:31.3665405Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3665448Z self._join_processes(fn) 2025-12-04T13:21:31.3665622Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3665677Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3665855Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3665898Z raise RuntimeError(error) 2025-12-04T13:21:31.3665979Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3666024Z Traceback (most recent call last): 2025-12-04T13:21:31.3666187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3666229Z getattr(self, test_name)() 2025-12-04T13:21:31.3666399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3666433Z fn() 2025-12-04T13:21:31.3666587Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3666628Z method(*args, **kwargs) 2025-12-04T13:21:31.3666781Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3666820Z method(*args, **kwargs) 2025-12-04T13:21:31.3666972Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3667007Z with policy(): 2025-12-04T13:21:31.3667160Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3667200Z raise RuntimeError(msg) 2025-12-04T13:21:31.3667573Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 
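The leak errors above report caching-allocator bytes and driver-allocated bytes before and after the test. As a rough illustration of what those numbers correspond to (this is not the CI leak checker's internal implementation), one can snapshot allocator and driver memory around a suspect block:

    import torch

    torch.cuda.synchronize()
    alloc_before = torch.cuda.memory_allocated()    # caching-allocator bytes in use
    free_before, total = torch.cuda.mem_get_info()  # driver-level free/total bytes

    run_suspect_code()  # hypothetical stand-in for the test body

    torch.cuda.synchronize()
    alloc_after = torch.cuda.memory_allocated()
    free_after, _ = torch.cuda.mem_get_info()
    if alloc_after > alloc_before:
        print(f"possible leak: allocator {alloc_before} -> {alloc_after} bytes, "
              f"driver-allocated {total - free_before} -> {total - free_after} bytes")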
2025-12-04T13:21:31.3667595Z 2025-12-04T13:21:31.3667671Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3667910Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3667912Z 2025-12-04T13:21:31.3668001Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3668004Z 2025-12-04T13:21:31.3668005Z 2025-12-04T13:21:31.3668080Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3668207Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3668444Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c20ad9eb622651c0.xml - 2025-12-04T13:21:31.3668507Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3668764Z FAILED [33.2347s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3668810Z Traceback (most recent call last): 2025-12-04T13:21:31.3668974Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3669016Z getattr(self, test_name)() 2025-12-04T13:21:31.3669175Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3669210Z fn() 2025-12-04T13:21:31.3669362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3669403Z method(*args, **kwargs) 2025-12-04T13:21:31.3669553Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3669595Z method(*args, **kwargs) 2025-12-04T13:21:31.3669747Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3669784Z with policy(): 2025-12-04T13:21:31.3669937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3669976Z raise RuntimeError(msg) 2025-12-04T13:21:31.3670353Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3670355Z 2025-12-04T13:21:31.3670431Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3670673Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3670675Z 2025-12-04T13:21:31.3670763Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3670826Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3670890Z ====================== 1 failed, 18 deselected in 33.40s ======================= 2025-12-04T13:21:31.3670926Z Got exit code 1 2025-12-04T13:21:31.3670967Z Retrying single test... 2025-12-04T13:21:31.3671157Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-80f6c7f9f6e17155.xml 2025-12-04T13:21:31.3671229Z ============================= test session starts ============================== 2025-12-04T13:21:31.3671354Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3671409Z cachedir: .pytest_cache 2025-12-04T13:21:31.3671568Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3671615Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3671655Z configfile: pytest.ini 2025-12-04T13:21:31.3671820Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3671894Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3672131Z stepcurrent: skipping 3 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3672173Z Running 1 items in this shard 2025-12-04T13:21:31.3672177Z 2025-12-04T13:21:31.3672495Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda I1204 13:02:21.497000 533710 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 533779 2025-12-04T13:21:31.3672652Z I1204 13:02:21.497000 533710 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 533780 2025-12-04T13:21:31.3672803Z I1204 13:02:21.498000 533710 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 533781 2025-12-04T13:21:31.3672953Z I1204 13:02:21.499000 533710 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 533782 2025-12-04T13:21:31.3673535Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3673575Z _warn_cpu_init() 2025-12-04T13:21:31.3674143Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3674179Z _warn_cpu_init() 2025-12-04T13:21:31.3674757Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3674796Z _warn_cpu_init() 2025-12-04T13:21:31.3675360Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3675397Z _warn_cpu_init() 2025-12-04T13:21:31.3675696Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3676086Z return func(*args, **kwargs) 2025-12-04T13:21:31.3676229Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3676392Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3676680Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3676837Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3677125Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3677254Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3677532Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3677680Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3677957Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3678104Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3678435Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3678572Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3678849Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3679014Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3679502Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 2025-12-04T13:21:31.3679622Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3679817Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3680187Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3680332Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3680560Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3680724Z [rank0]:E1204 13:02:52.803000 533779 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3680762Z dist init r=0, world=4 2025-12-04T13:21:31.3680900Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3681059Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3681350Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3681504Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3681792Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3681916Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3682195Z [rank1]:E1204 13:02:52.804000 533780 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3682344Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3682619Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3682768Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3683043Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3683189Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3683468Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3683618Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3684105Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 
2025-12-04T13:21:31.3684220Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3684416Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3684805Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3684930Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3685141Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3685306Z [rank1]:E1204 13:02:52.804000 533780 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3685347Z dist init r=1, world=4 2025-12-04T13:21:31.3685484Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3685645Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3685933Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3686086Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3686370Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3686496Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3686773Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3686922Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3687197Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3687343Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3687630Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3687766Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3688045Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3688260Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3688745Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 3. CUDA driver allocated memory was 2250244096 and is now 3785359360. 2025-12-04T13:21:31.3688890Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3689098Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3689462Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3689575Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3689787Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3689950Z [rank3]:E1204 13:02:52.863000 533782 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3689992Z dist init r=3, world=4 2025-12-04T13:21:31.3690131Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3690289Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3690575Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3690730Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3691019Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3691144Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3691421Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3691567Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.3691856Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3692004Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3692281Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3692417Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3692693Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3692843Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3693342Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 2. CUDA driver allocated memory was 2300575744 and is now 3835691008. 2025-12-04T13:21:31.3693475Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3693671Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3694036Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3694150Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3694362Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3694527Z [rank2]:E1204 13:02:52.864000 533781 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3694565Z dist init r=2, world=4 2025-12-04T13:21:31.3694901Z [rank0]:[W1204 13:02:52.636059224 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3694943Z FAILED [33.2358s] [100%] 2025-12-04T13:21:31.3694946Z 2025-12-04T13:21:31.3695001Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3695110Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.3695157Z Traceback (most recent call last): 2025-12-04T13:21:31.3695321Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3695364Z self._join_processes(fn) 2025-12-04T13:21:31.3695538Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3695592Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3695771Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3695813Z raise RuntimeError(error) 2025-12-04T13:21:31.3695903Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3695948Z Traceback (most recent call last): 2025-12-04T13:21:31.3696111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3696153Z getattr(self, test_name)() 2025-12-04T13:21:31.3696313Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3696347Z fn() 2025-12-04T13:21:31.3696500Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3696539Z method(*args, **kwargs) 2025-12-04T13:21:31.3696690Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3696729Z method(*args, **kwargs) 2025-12-04T13:21:31.3696881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3696917Z with policy(): 2025-12-04T13:21:31.3697096Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3697149Z raise RuntimeError(msg) 2025-12-04T13:21:31.3697507Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
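Two warnings above point at process-group hygiene: barrier() recommends specifying `device_id` in `init_process_group`, and ProcessGroupNCCL warns when `destroy_process_group()` is never called before exit. A minimal sketch of both, assuming the usual env:// variables (RANK, WORLD_SIZE, MASTER_ADDR, MASTER_PORT) are set by the launcher:

    import os
    import torch
    import torch.distributed as dist

    local_rank = int(os.environ["RANK"]) % torch.cuda.device_count()
    torch.cuda.set_device(local_rank)
    dist.init_process_group(
        backend="nccl",
        device_id=torch.device("cuda", local_rank),  # lets barrier() pick the right device
    )
    try:
        dist.barrier()
        # ... test body ...
    finally:
        dist.destroy_process_group()  # avoids the resource-leak warning at program exit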
2025-12-04T13:21:31.3697510Z 2025-12-04T13:21:31.3697586Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3697830Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3697833Z 2025-12-04T13:21:31.3697924Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3697928Z 2025-12-04T13:21:31.3697986Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3698033Z Traceback (most recent call last): 2025-12-04T13:21:31.3698241Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3698283Z getattr(self, test_name)() 2025-12-04T13:21:31.3698441Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3698476Z fn() 2025-12-04T13:21:31.3698625Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3698665Z method(*args, **kwargs) 2025-12-04T13:21:31.3698816Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3698856Z method(*args, **kwargs) 2025-12-04T13:21:31.3699006Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3699044Z with policy(): 2025-12-04T13:21:31.3699196Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3699236Z raise RuntimeError(msg) 2025-12-04T13:21:31.3699593Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:21:31.3699595Z 2025-12-04T13:21:31.3699667Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3699921Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3699925Z 2025-12-04T13:21:31.3700012Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3700015Z 2025-12-04T13:21:31.3700018Z 2025-12-04T13:21:31.3700092Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3700181Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.3700413Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-80f6c7f9f6e17155.xml - 2025-12-04T13:21:31.3700474Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3700728Z FAILED [33.2358s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3700776Z Traceback (most recent call last): 2025-12-04T13:21:31.3700968Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3701023Z getattr(self, test_name)() 2025-12-04T13:21:31.3701182Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3701216Z fn() 2025-12-04T13:21:31.3701368Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3701409Z method(*args, **kwargs) 2025-12-04T13:21:31.3701559Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3701598Z method(*args, **kwargs) 2025-12-04T13:21:31.3701748Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3701786Z with policy(): 2025-12-04T13:21:31.3701938Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3701981Z raise RuntimeError(msg) 2025-12-04T13:21:31.3702344Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 0. CUDA driver allocated memory was 2453667840 and is now 3988783104. 
2025-12-04T13:21:31.3702346Z 2025-12-04T13:21:31.3702419Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3702658Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3702661Z 2025-12-04T13:21:31.3702746Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3702750Z 2025-12-04T13:21:31.3702809Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3702854Z Traceback (most recent call last): 2025-12-04T13:21:31.3703017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3703058Z getattr(self, test_name)() 2025-12-04T13:21:31.3703219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3703252Z fn() 2025-12-04T13:21:31.3703403Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3703441Z method(*args, **kwargs) 2025-12-04T13:21:31.3703601Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3703640Z method(*args, **kwargs) 2025-12-04T13:21:31.3703793Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3703830Z with policy(): 2025-12-04T13:21:31.3703982Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3704023Z raise RuntimeError(msg) 2025-12-04T13:21:31.3704378Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3852468224. 2025-12-04T13:21:31.3704381Z 2025-12-04T13:21:31.3704456Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3704706Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3704720Z 2025-12-04T13:21:31.3704816Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3704879Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3704943Z ====================== 1 failed, 18 deselected in 33.37s ======================= 2025-12-04T13:21:31.3704979Z Got exit code 1 2025-12-04T13:21:31.3705167Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.3705297Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3705488Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c97dc3beffec5ac9.xml 2025-12-04T13:21:31.3705547Z ============================= test session starts ============================== 2025-12-04T13:21:31.3705660Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3705702Z cachedir: .pytest_cache 2025-12-04T13:21:31.3705859Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3705905Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3705945Z configfile: pytest.ini 2025-12-04T13:21:31.3706107Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3706181Z collecting ... collected 60 items / 4 deselected / 56 selected 2025-12-04T13:21:31.3706234Z stepcurrent: skipping 4 already run items. 2025-12-04T13:21:31.3706277Z Running 15 items in this shard 2025-12-04T13:21:31.3706279Z 2025-12-04T13:21:31.3706596Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda I1204 13:02:57.279000 534112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 534181 2025-12-04T13:21:31.3706752Z I1204 13:02:57.279000 534112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 534182 2025-12-04T13:21:31.3706906Z I1204 13:02:57.280000 534112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 534183 2025-12-04T13:21:31.3707056Z I1204 13:02:57.281000 534112 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 534184 2025-12-04T13:21:31.3707646Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3707686Z _warn_cpu_init() 2025-12-04T13:21:31.3708294Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3708332Z _warn_cpu_init() 2025-12-04T13:21:31.3708916Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3708979Z _warn_cpu_init() 2025-12-04T13:21:31.3709551Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3709586Z _warn_cpu_init() 2025-12-04T13:21:31.3709879Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3709922Z return func(*args, **kwargs) 2025-12-04T13:21:31.3710066Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3710229Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3710540Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3710696Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3710988Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3711115Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3711396Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3711545Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3711823Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3711991Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3712270Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3712409Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3712686Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3712833Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3713329Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3713465Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3713662Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3714033Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3714148Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3714363Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3714527Z [rank0]:E1204 13:03:34.989000 534181 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3714567Z dist init r=0, world=4 2025-12-04T13:21:31.3714704Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3714865Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3715152Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3715308Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3715595Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3715718Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3715995Z [rank3]:E1204 13:03:34.997000 534184 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3716151Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3716430Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3716579Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3716855Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3716992Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3717270Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3717430Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3717932Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 
2025-12-04T13:21:31.3718048Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3718276Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3718643Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3718760Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3718971Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3719136Z [rank3]:E1204 13:03:34.997000 534184 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3719175Z dist init r=3, world=4 2025-12-04T13:21:31.3719314Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3719474Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3719763Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3719918Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3720203Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3720328Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3720618Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3720768Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3721045Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3721192Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3721469Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3721625Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3721914Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3722076Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3722560Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3722674Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3722872Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3723238Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3723351Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3723564Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3723727Z [rank1]:E1204 13:03:35.030000 534182 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3723767Z dist init r=1, world=4 2025-12-04T13:21:31.3723905Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3724066Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3724351Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3724506Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3724800Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3724926Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3725205Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3725352Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.3725631Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3725780Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3726066Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3726234Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3726511Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3726660Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3727143Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3727260Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3727454Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3727818Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3727933Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3728186Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3728351Z [rank2]:E1204 13:03:35.036000 534183 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3728390Z dist init r=2, world=4 2025-12-04T13:21:31.3728729Z [rank0]:[W1204 13:03:35.903145783 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3728768Z FAILED [39.8366s] [ 6%] 2025-12-04T13:21:31.3728770Z 2025-12-04T13:21:31.3728828Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3728945Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3728992Z Traceback (most recent call last): 2025-12-04T13:21:31.3729157Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3729202Z self._join_processes(fn) 2025-12-04T13:21:31.3729375Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3729430Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3729606Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3729651Z raise RuntimeError(error) 2025-12-04T13:21:31.3729733Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3729777Z Traceback (most recent call last): 2025-12-04T13:21:31.3729940Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3730009Z getattr(self, test_name)() 2025-12-04T13:21:31.3730169Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3730215Z fn() 2025-12-04T13:21:31.3730368Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3730409Z method(*args, **kwargs) 2025-12-04T13:21:31.3730561Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3730600Z method(*args, **kwargs) 2025-12-04T13:21:31.3730751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3730790Z with policy(): 2025-12-04T13:21:31.3730945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3730987Z raise RuntimeError(msg) 2025-12-04T13:21:31.3731347Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3731351Z 2025-12-04T13:21:31.3731426Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3731666Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3731668Z 2025-12-04T13:21:31.3731756Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3731760Z 2025-12-04T13:21:31.3731819Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3731866Z Traceback (most recent call last): 2025-12-04T13:21:31.3732030Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3732073Z getattr(self, test_name)() 2025-12-04T13:21:31.3732231Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3732266Z fn() 2025-12-04T13:21:31.3732416Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3732457Z method(*args, **kwargs) 2025-12-04T13:21:31.3732606Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3732647Z method(*args, **kwargs) 2025-12-04T13:21:31.3732810Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3732849Z with policy(): 2025-12-04T13:21:31.3733002Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3733045Z raise RuntimeError(msg) 2025-12-04T13:21:31.3733402Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:21:31.3733405Z 2025-12-04T13:21:31.3733479Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3733720Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3733722Z 2025-12-04T13:21:31.3733809Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3733830Z 2025-12-04T13:21:31.3733890Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3733945Z Traceback (most recent call last): 2025-12-04T13:21:31.3734108Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3734149Z getattr(self, test_name)() 2025-12-04T13:21:31.3734308Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3734340Z fn() 2025-12-04T13:21:31.3734491Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3734530Z method(*args, **kwargs) 2025-12-04T13:21:31.3734682Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3734721Z method(*args, **kwargs) 2025-12-04T13:21:31.3734872Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3734909Z with policy(): 2025-12-04T13:21:31.3735062Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3735102Z raise RuntimeError(msg) 2025-12-04T13:21:31.3735461Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:21:31.3735463Z 2025-12-04T13:21:31.3735537Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3735776Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3735779Z 2025-12-04T13:21:31.3735866Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3735868Z 2025-12-04T13:21:31.3735870Z 2025-12-04T13:21:31.3735945Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3736034Z Process 0 terminated with exit code 10, terminating remaining processes. 
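Two of the warnings repeated above are actionable in user or test code: the FSDP UserWarning from _init_utils.py recommends passing `device_id` so sharding initialization runs on the GPU (and is required for `sync_module_states=True`), and the ProcessGroupNCCL warning asks for an explicit destroy_process_group() before exit. A minimal per-rank sketch of both, where the model and the rank/world-size wiring are placeholders rather than anything taken from the test:

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def run(rank: int, world_size: int) -> None:
        torch.cuda.set_device(rank)
        # Passing device_id here also mutes the barrier() warning about the
        # device being taken from the current context.
        dist.init_process_group(
            "nccl", rank=rank, world_size=world_size,
            device_id=torch.device("cuda", rank),
        )
        model = torch.nn.Linear(8, 8)  # placeholder module, initially on CPU
        # device_id moves the module to the GPU for sharding init, as the
        # FSDP warning above recommends.
        fsdp_model = FSDP(model, device_id=rank, sync_module_states=True)
        # ... training / test body using fsdp_model ...
        dist.barrier()
        # Explicit teardown avoids the "destroy_process_group() was not
        # called before program exit" warning.
        dist.destroy_process_group()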
2025-12-04T13:21:31.3736269Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-c97dc3beffec5ac9.xml - 2025-12-04T13:21:31.3736332Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3736596Z FAILED [39.8366s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3736645Z Traceback (most recent call last): 2025-12-04T13:21:31.3736812Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3736854Z getattr(self, test_name)() 2025-12-04T13:21:31.3737014Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3737047Z fn() 2025-12-04T13:21:31.3737199Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3737237Z method(*args, **kwargs) 2025-12-04T13:21:31.3737388Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3737428Z method(*args, **kwargs) 2025-12-04T13:21:31.3737578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3737632Z with policy(): 2025-12-04T13:21:31.3737784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3737834Z raise RuntimeError(msg) 2025-12-04T13:21:31.3738226Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3738229Z 2025-12-04T13:21:31.3738301Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3738539Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3738541Z 2025-12-04T13:21:31.3738630Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3738633Z 2025-12-04T13:21:31.3738690Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3738736Z Traceback (most recent call last): 2025-12-04T13:21:31.3738898Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3738940Z getattr(self, test_name)() 2025-12-04T13:21:31.3739098Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3739132Z fn() 2025-12-04T13:21:31.3739282Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3739322Z method(*args, **kwargs) 2025-12-04T13:21:31.3739472Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3739513Z method(*args, **kwargs) 2025-12-04T13:21:31.3739664Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3739701Z with policy(): 2025-12-04T13:21:31.3739852Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3739894Z raise RuntimeError(msg) 2025-12-04T13:21:31.3740247Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:21:31.3740251Z 2025-12-04T13:21:31.3740339Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3740577Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3740581Z 2025-12-04T13:21:31.3740668Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3740670Z 2025-12-04T13:21:31.3740729Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3740773Z Traceback (most recent call last): 2025-12-04T13:21:31.3740937Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3740978Z getattr(self, test_name)() 2025-12-04T13:21:31.3741137Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3741170Z fn() 2025-12-04T13:21:31.3741321Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3741360Z method(*args, **kwargs) 2025-12-04T13:21:31.3741543Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3741594Z method(*args, **kwargs) 2025-12-04T13:21:31.3741745Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3741781Z with policy(): 2025-12-04T13:21:31.3741932Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3741973Z raise RuntimeError(msg) 2025-12-04T13:21:31.3742330Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:21:31.3742332Z 2025-12-04T13:21:31.3742407Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3742643Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3742646Z 2025-12-04T13:21:31.3742733Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3742797Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3742861Z ======================= 1 failed, 4 deselected in 39.97s ======================= 2025-12-04T13:21:31.3742897Z Got exit code 1 2025-12-04T13:21:31.3742938Z Retrying single test... 
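After the consistent failure, the harness retries only the failing node id (the session below shows the stepcurrent plugin restricting the run to that single item). A rough local equivalent, assuming pytest is available and the repo root is the working directory, is to select the one node id and stop on the first failure:

    import pytest

    # Run exactly one test node id and stop on first failure; the CI retry
    # below additionally relies on the stepcurrent plugin to skip items
    # that already ran in this shard.
    exit_code = pytest.main([
        "test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda",
        "-x",
        "-v",
    ])
    print("pytest exit code:", exit_code)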
2025-12-04T13:21:31.3743127Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6e9a6375f681e708.xml 2025-12-04T13:21:31.3743185Z ============================= test session starts ============================== 2025-12-04T13:21:31.3743298Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3743342Z cachedir: .pytest_cache 2025-12-04T13:21:31.3743500Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3743548Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3743588Z configfile: pytest.ini 2025-12-04T13:21:31.3743752Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3743827Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3744068Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3744113Z Running 1 items in this shard 2025-12-04T13:21:31.3744115Z 2025-12-04T13:21:31.3744429Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda I1204 13:03:39.615000 534514 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 534583 2025-12-04T13:21:31.3744586Z I1204 13:03:39.616000 534514 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 534584 2025-12-04T13:21:31.3744737Z I1204 13:03:39.617000 534514 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 534585 2025-12-04T13:21:31.3744889Z I1204 13:03:39.618000 534514 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 534586 2025-12-04T13:21:31.3745482Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3745538Z _warn_cpu_init() 2025-12-04T13:21:31.3746103Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3746139Z _warn_cpu_init() 2025-12-04T13:21:31.3746713Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3746751Z _warn_cpu_init() 2025-12-04T13:21:31.3747313Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3747350Z _warn_cpu_init() 2025-12-04T13:21:31.3747645Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3747690Z return func(*args, **kwargs) 2025-12-04T13:21:31.3747833Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3747997Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3748320Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3748488Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3748774Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3748900Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3749180Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3749328Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3749606Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3749766Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3750054Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3750203Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3750481Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3750631Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3751115Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3751234Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3751431Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3751798Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3751913Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3752125Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3752292Z [rank1]:E1204 13:04:17.303000 534584 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3752330Z dist init r=1, world=4 2025-12-04T13:21:31.3752468Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3752626Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3752923Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3753080Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3753374Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3753509Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3753809Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3753991Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3754323Z [rank0]:E1204 13:04:17.319000 534583 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3754506Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3754807Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3754954Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3755261Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3755443Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3755940Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3756079Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3756292Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3756681Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3756829Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3757057Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3757243Z [rank0]:E1204 13:04:17.319000 534583 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3757297Z dist init r=0, world=4 2025-12-04T13:21:31.3758017Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3758215Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3758544Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3758710Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3759023Z [rank3]:E1204 
13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3759170Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3759468Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3759689Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3759975Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3760143Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3760444Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3760586Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3760910Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3761071Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3761576Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 
2025-12-04T13:21:31.3761702Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3761916Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3762328Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3762452Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3762699Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3762875Z [rank3]:E1204 13:04:17.369000 534586 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3762940Z dist init r=3, world=4 2025-12-04T13:21:31.3763101Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3763287Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3763600Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3763765Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3764086Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3764252Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3764559Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3764717Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3765024Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3765188Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3765486Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3765651Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3765939Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3766115Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3766608Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3766755Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3766979Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3767371Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3767507Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3767731Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3767926Z [rank2]:E1204 13:04:17.377000 534585 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3767982Z dist init r=2, world=4 2025-12-04T13:21:31.3768382Z [rank0]:[W1204 13:04:17.179298369 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3768446Z FAILED [39.6407s] [100%] 2025-12-04T13:21:31.3768449Z 2025-12-04T13:21:31.3768517Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3768657Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3768754Z Traceback (most recent call last): 2025-12-04T13:21:31.3768939Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3769008Z self._join_processes(fn) 2025-12-04T13:21:31.3769204Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3769264Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3769488Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3769544Z raise RuntimeError(error) 2025-12-04T13:21:31.3769648Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3769706Z Traceback (most recent call last): 2025-12-04T13:21:31.3769893Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3769965Z getattr(self, test_name)() 2025-12-04T13:21:31.3770152Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3770198Z fn() 2025-12-04T13:21:31.3770372Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3770436Z method(*args, **kwargs) 2025-12-04T13:21:31.3770623Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3770692Z method(*args, **kwargs) 2025-12-04T13:21:31.3770854Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3770913Z with policy(): 2025-12-04T13:21:31.3771085Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3771156Z raise RuntimeError(msg) 2025-12-04T13:21:31.3771531Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3771533Z 2025-12-04T13:21:31.3771631Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3771882Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3771885Z 2025-12-04T13:21:31.3772015Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3772017Z 2025-12-04T13:21:31.3772105Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3772169Z Traceback (most recent call last): 2025-12-04T13:21:31.3772357Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3773643Z getattr(self, test_name)() 2025-12-04T13:21:31.3773832Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3773872Z fn() 2025-12-04T13:21:31.3774061Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3774113Z method(*args, **kwargs) 2025-12-04T13:21:31.3774291Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3774342Z method(*args, **kwargs) 2025-12-04T13:21:31.3774524Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3774595Z with policy(): 2025-12-04T13:21:31.3774791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3774842Z raise RuntimeError(msg) 2025-12-04T13:21:31.3775220Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3775222Z 2025-12-04T13:21:31.3775317Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3775577Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3775579Z 2025-12-04T13:21:31.3775700Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3775704Z 2025-12-04T13:21:31.3775706Z 2025-12-04T13:21:31.3775791Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3775902Z Process 0 terminated with exit code 10, terminating remaining processes. 
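The ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit, which can leak resources") points at the documented shutdown step for torch.distributed programs. A minimal sketch of that lifecycle, assuming a standalone script launched with torchrun rather than the multiprocess test harness used in this log; the device_id= argument to init_process_group is available in recent PyTorch releases and also addresses the barrier() device-context warning that appears further down:

    import os
    import torch
    import torch.distributed as dist

    def main() -> None:
        # torchrun sets RANK, WORLD_SIZE and LOCAL_RANK in the environment.
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)
        # Bind the default group to this rank's GPU up front.
        dist.init_process_group(backend="nccl", device_id=torch.device("cuda", local_rank))
        try:
            # ... training / test body ...
            dist.barrier()
        finally:
            # The step the warning says is missing: release NCCL resources explicitly.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()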
2025-12-04T13:21:31.3776146Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6e9a6375f681e708.xml - 2025-12-04T13:21:31.3776239Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3776517Z FAILED [39.6407s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3776587Z Traceback (most recent call last): 2025-12-04T13:21:31.3776776Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3776832Z getattr(self, test_name)() 2025-12-04T13:21:31.3777022Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3777077Z fn() 2025-12-04T13:21:31.3777251Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3777302Z method(*args, **kwargs) 2025-12-04T13:21:31.3777476Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3777521Z method(*args, **kwargs) 2025-12-04T13:21:31.3777729Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3777778Z with policy(): 2025-12-04T13:21:31.3777956Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3778008Z raise RuntimeError(msg) 2025-12-04T13:21:31.3778659Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3778662Z 2025-12-04T13:21:31.3778785Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3779033Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3779035Z 2025-12-04T13:21:31.3779146Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3779148Z 2025-12-04T13:21:31.3779235Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3779317Z Traceback (most recent call last): 2025-12-04T13:21:31.3779516Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3779587Z getattr(self, test_name)() 2025-12-04T13:21:31.3779757Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3779820Z fn() 2025-12-04T13:21:31.3779981Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3780053Z method(*args, **kwargs) 2025-12-04T13:21:31.3780219Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3780282Z method(*args, **kwargs) 2025-12-04T13:21:31.3780449Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3780512Z with policy(): 2025-12-04T13:21:31.3780695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3780752Z raise RuntimeError(msg) 2025-12-04T13:21:31.3781129Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3781131Z 2025-12-04T13:21:31.3781219Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3781478Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3781481Z 2025-12-04T13:21:31.3781574Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3781679Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.3781755Z ====================== 1 failed, 18 deselected in 39.81s ======================= 2025-12-04T13:21:31.3781821Z Got exit code 1 2025-12-04T13:21:31.3781872Z Retrying single test... 
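The RuntimeError above is raised by the CUDA memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it records caching-allocator and driver-level memory per device before the test body and compares again afterwards, which is exactly the two number pairs quoted in the message. The "To execute this test" lines give the repro command that re-runs just this test with the check on. A rough standalone sketch of the same comparison (check_leak is a hypothetical helper for illustration, not the internal policy object raised from common_utils.py; the in-tree checker also empties caches, retries, and inspects every visible device):

    import torch

    def check_leak(fn, device: int = 0) -> None:
        """Run fn() and complain if CUDA memory on `device` does not return to baseline."""
        torch.cuda.synchronize(device)
        alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator view
        free_before, total = torch.cuda.mem_get_info(device)    # driver view (free, total)
        fn()
        torch.cuda.synchronize(device)
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        if alloc_after > alloc_before or free_after < free_before:
            raise RuntimeError(
                f"possible leak on device {device}: "
                f"allocator {alloc_before} -> {alloc_after}, "
                f"driver allocated {total - free_before} -> {total - free_after}"
            )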
2025-12-04T13:21:31.3782085Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ab74cfc34851cb6b.xml 2025-12-04T13:21:31.3782165Z ============================= test session starts ============================== 2025-12-04T13:21:31.3782306Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3782388Z cachedir: .pytest_cache 2025-12-04T13:21:31.3782559Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3782626Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3782692Z configfile: pytest.ini 2025-12-04T13:21:31.3782891Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3782976Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3783232Z stepcurrent: skipping 4 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3783286Z Running 1 items in this shard 2025-12-04T13:21:31.3783289Z 2025-12-04T13:21:31.3783640Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda I1204 13:04:22.030000 534916 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 534985 2025-12-04T13:21:31.3783831Z I1204 13:04:22.030000 534916 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 534986 2025-12-04T13:21:31.3784029Z I1204 13:04:22.031000 534916 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 534987 2025-12-04T13:21:31.3784202Z I1204 13:04:22.032000 534916 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 534988 2025-12-04T13:21:31.3784795Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3784869Z _warn_cpu_init() 2025-12-04T13:21:31.3785457Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3785519Z _warn_cpu_init() 2025-12-04T13:21:31.3786111Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3786160Z _warn_cpu_init() 2025-12-04T13:21:31.3786760Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3786813Z _warn_cpu_init() 2025-12-04T13:21:31.3787129Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3787205Z return func(*args, **kwargs) 2025-12-04T13:21:31.3787364Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3787560Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3787868Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3788044Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3788386Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3788540Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3788839Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3789054Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3789360Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3789519Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3789821Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3789966Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3790286Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3790450Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3790963Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3791105Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3791307Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3791713Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3791843Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3792094Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3792283Z [rank1]:E1204 13:04:59.690000 534986 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3792329Z dist init r=1, world=4 2025-12-04T13:21:31.3792512Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3792683Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3792993Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3793159Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3793473Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3793648Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3793958Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3794128Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3794416Z [rank2]:E1204 13:04:59.696000 534987 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3794585Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3794887Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3795054Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3795355Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3795515Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3796021Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3796159Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3796382Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3796765Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3796907Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3797138Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3797323Z [rank2]:E1204 13:04:59.696000 534987 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3797391Z dist init r=2, world=4 2025-12-04T13:21:31.3797539Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3797725Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3798033Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3798264Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3798590Z [rank0]:E1204 
13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3798730Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3799030Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3799191Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3799495Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3799660Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3799966Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3800124Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3800414Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3800592Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3801091Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3801233Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3801451Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3801838Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3801984Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3802216Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3802404Z [rank0]:E1204 13:04:59.697000 534985 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3802455Z dist init r=0, world=4 2025-12-04T13:21:31.3802618Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3802795Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3803147Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3805749Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3806045Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3806200Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3806482Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3806670Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3806956Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3807126Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3807432Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3807575Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3807892Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3808054Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3808614Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:21:31.3808745Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3808960Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3809360Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3809486Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3809725Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3809915Z [rank3]:E1204 13:04:59.737000 534988 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3809999Z dist init r=3, world=4 2025-12-04T13:21:31.3810357Z [rank0]:[W1204 13:04:59.550216122 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3810432Z FAILED [39.5386s] [100%] 2025-12-04T13:21:31.3810435Z 2025-12-04T13:21:31.3810515Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3810631Z _ TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3810696Z Traceback (most recent call last): 2025-12-04T13:21:31.3810884Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3810960Z self._join_processes(fn) 2025-12-04T13:21:31.3811145Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3811222Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3811410Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3811486Z raise RuntimeError(error) 2025-12-04T13:21:31.3811588Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3811657Z Traceback (most recent call last): 2025-12-04T13:21:31.3811829Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3811893Z getattr(self, test_name)() 2025-12-04T13:21:31.3812058Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3812137Z fn() 2025-12-04T13:21:31.3812300Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3812364Z method(*args, **kwargs) 2025-12-04T13:21:31.3812526Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3812584Z method(*args, **kwargs) 2025-12-04T13:21:31.3812781Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3812829Z with policy(): 2025-12-04T13:21:31.3813004Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3813067Z raise RuntimeError(msg) 2025-12-04T13:21:31.3813449Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 
2025-12-04T13:21:31.3813453Z 2025-12-04T13:21:31.3813552Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3813818Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3813821Z 2025-12-04T13:21:31.3813919Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3813921Z 2025-12-04T13:21:31.3813937Z 2025-12-04T13:21:31.3814026Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3814137Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3814406Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-ab74cfc34851cb6b.xml - 2025-12-04T13:21:31.3814516Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3814782Z FAILED [39.5386s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3814856Z Traceback (most recent call last): 2025-12-04T13:21:31.3815031Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3815101Z getattr(self, test_name)() 2025-12-04T13:21:31.3815279Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3815335Z fn() 2025-12-04T13:21:31.3815506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3815572Z method(*args, **kwargs) 2025-12-04T13:21:31.3815730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3815810Z method(*args, **kwargs) 2025-12-04T13:21:31.3815972Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3816036Z with policy(): 2025-12-04T13:21:31.3816211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3816257Z raise RuntimeError(msg) 2025-12-04T13:21:31.3816653Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3816656Z 2025-12-04T13:21:31.3816742Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3817010Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3817012Z 2025-12-04T13:21:31.3817109Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3817189Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.3817273Z ====================== 1 failed, 18 deselected in 39.70s ======================= 2025-12-04T13:21:31.3817341Z Got exit code 1 2025-12-04T13:21:31.3817554Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3817708Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3817917Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-1545c1c5fac9b58b.xml 2025-12-04T13:21:31.3817999Z ============================= test session starts ============================== 2025-12-04T13:21:31.3818180Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3818231Z cachedir: .pytest_cache 2025-12-04T13:21:31.3818477Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3818534Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3823610Z configfile: pytest.ini 2025-12-04T13:21:31.3823787Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3823864Z collecting ... collected 60 items / 5 deselected / 55 selected 2025-12-04T13:21:31.3823974Z stepcurrent: skipping 5 already run items. 2025-12-04T13:21:31.3824033Z Running 14 items in this shard 2025-12-04T13:21:31.3824036Z 2025-12-04T13:21:31.3824361Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda I1204 13:05:04.342000 535318 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 535387 2025-12-04T13:21:31.3824516Z I1204 13:05:04.342000 535318 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 535388 2025-12-04T13:21:31.3824669Z I1204 13:05:04.343000 535318 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 535389 2025-12-04T13:21:31.3824820Z I1204 13:05:04.343000 535318 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 535390 2025-12-04T13:21:31.3825404Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3825445Z _warn_cpu_init() 2025-12-04T13:21:31.3826014Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
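The UserWarning above recommends passing device_id so FSDP runs its sharding initialization on the GPU instead of on CPU, and notes that sync_module_states=True needs the module on a GPU device. A minimal sketch of the recommended construction, assuming the process group is already initialized and `model` is a hypothetical nn.Module built on CPU:

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # `model` is assumed to be an ordinary nn.Module created on CPU.
    fsdp_model = FSDP(
        model,
        device_id=torch.cuda.current_device(),  # move the module to this rank's GPU for sharding init
        sync_module_states=True,                # requires the module on GPU, per the warning
    )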
2025-12-04T13:21:31.3826052Z _warn_cpu_init() 2025-12-04T13:21:31.3826617Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3826656Z _warn_cpu_init() 2025-12-04T13:21:31.3827243Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3827282Z _warn_cpu_init() 2025-12-04T13:21:31.3827577Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3827621Z return func(*args, **kwargs) 2025-12-04T13:21:31.3827767Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3827930Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3828279Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3828461Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3828773Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3828899Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3829176Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3829327Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3829604Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3829753Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3830031Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3830168Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3830446Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3830594Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3831090Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3831208Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3831403Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3831792Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3831907Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3832120Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3832286Z [rank2]:E1204 13:06:00.441000 535389 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3832326Z dist init r=2, world=4 2025-12-04T13:21:31.3832464Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3832624Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3832932Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3833097Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3833381Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3833505Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3833783Z [rank1]:E1204 13:06:00.452000 535388 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3833930Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3834207Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3834354Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3834632Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3834767Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3835044Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3835195Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3835683Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:21:31.3835807Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3836004Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3836376Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3836491Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3836703Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3836869Z [rank1]:E1204 13:06:00.452000 535388 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3836916Z dist init r=1, world=4 2025-12-04T13:21:31.3837064Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3837234Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3837521Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3837674Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3837959Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3838085Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3838401Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3838548Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3838822Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3838973Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3839251Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3839389Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3839666Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3839813Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3840320Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:21:31.3840434Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3840631Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3841001Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3841115Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3841339Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3841525Z [rank3]:E1204 13:06:00.496000 535390 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3841564Z dist init r=3, world=4 2025-12-04T13:21:31.3841704Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3841865Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3842150Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3842306Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3842590Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3842714Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3842989Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3843137Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.3843413Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3843561Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3843838Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3843975Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3844269Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3844419Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3844907Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3845022Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3845216Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3845594Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3845727Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3845937Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3846102Z [rank0]:E1204 13:06:00.512000 535387 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3846140Z dist init r=0, world=4 2025-12-04T13:21:31.3846481Z [rank0]:[W1204 13:06:00.571404868 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3846523Z FAILED [58.0537s] [ 7%] 2025-12-04T13:21:31.3846526Z 2025-12-04T13:21:31.3846586Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3846698Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3846746Z Traceback (most recent call last): 2025-12-04T13:21:31.3846911Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3846954Z self._join_processes(fn) 2025-12-04T13:21:31.3847127Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3847181Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3847361Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3847404Z raise RuntimeError(error) 2025-12-04T13:21:31.3847489Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3847534Z Traceback (most recent call last): 2025-12-04T13:21:31.3847695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3847737Z getattr(self, test_name)() 2025-12-04T13:21:31.3847898Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3847933Z fn() 2025-12-04T13:21:31.3848086Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3848128Z method(*args, **kwargs) 2025-12-04T13:21:31.3848338Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3848378Z method(*args, **kwargs) 2025-12-04T13:21:31.3848532Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3848570Z with policy(): 2025-12-04T13:21:31.3848725Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3848765Z raise RuntimeError(msg) 2025-12-04T13:21:31.3849131Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 
2025-12-04T13:21:31.3849133Z 2025-12-04T13:21:31.3849209Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3849470Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3849486Z 2025-12-04T13:21:31.3849576Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3849593Z 2025-12-04T13:21:31.3849595Z 2025-12-04T13:21:31.3849670Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3849760Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3849994Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-1545c1c5fac9b58b.xml - 2025-12-04T13:21:31.3850055Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3850316Z FAILED [58.0537s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.3850367Z Traceback (most recent call last): 2025-12-04T13:21:31.3850533Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3850575Z getattr(self, test_name)() 2025-12-04T13:21:31.3850735Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3850768Z fn() 2025-12-04T13:21:31.3850920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3850962Z method(*args, **kwargs) 2025-12-04T13:21:31.3851112Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3851152Z method(*args, **kwargs) 2025-12-04T13:21:31.3851302Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3851341Z with policy(): 2025-12-04T13:21:31.3851493Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3851534Z raise RuntimeError(msg) 2025-12-04T13:21:31.3851897Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3851900Z 2025-12-04T13:21:31.3851973Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3852227Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3852229Z 2025-12-04T13:21:31.3852317Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3852381Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
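Editor's note: each run above also ends with the ProcessGroupNCCL warning that destroy_process_group() was not called before program exit. A minimal sketch of the shutdown pattern that warning (and the later "barrier(): using the device under current context" warning) point to, assuming the usual MASTER_ADDR/MASTER_PORT rendezvous variables are set by the launcher; the run() helper and the bare barrier() stand in for the real test body, and the device_id argument requires a recent PyTorch release.

import torch
import torch.distributed as dist


def run(rank: int, world_size: int) -> None:
    device = torch.device(f"cuda:{rank}")
    # Binding the process group to a device up front also silences the
    # barrier() device warning seen later in this log.
    dist.init_process_group("nccl", rank=rank, world_size=world_size, device_id=device)
    try:
        dist.barrier()  # collectives for the actual test body would go here
    finally:
        # Explicit teardown avoids the "can leak resources" warning at exit.
        dist.destroy_process_group()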
2025-12-04T13:21:31.3852445Z ======================= 1 failed, 5 deselected in 58.19s =======================
2025-12-04T13:21:31.3852482Z Got exit code 1
2025-12-04T13:21:31.3852521Z Retrying single test...
2025-12-04T13:21:31.3852710Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-cd73215e76dd89cf.xml
2025-12-04T13:21:31.3852770Z ============================= test session starts ==============================
2025-12-04T13:21:31.3852881Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python
2025-12-04T13:21:31.3852923Z cachedir: .pytest_cache
2025-12-04T13:21:31.3853081Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:21:31.3853128Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:21:31.3853200Z configfile: pytest.ini
2025-12-04T13:21:31.3853366Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:21:31.3853451Z collecting ... collected 60 items / 18 deselected / 42 selected
2025-12-04T13:21:31.3853688Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda
2025-12-04T13:21:31.3853731Z Running 1 items in this shard
2025-12-04T13:21:31.3853733Z
2025-12-04T13:21:31.3854054Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda I1204 13:06:04.930000 535720 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 535789
2025-12-04T13:21:31.3854210Z I1204 13:06:04.930000 535720 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 535790
2025-12-04T13:21:31.3854363Z I1204 13:06:04.931000 535720 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 535791
2025-12-04T13:21:31.3854515Z I1204 13:06:04.932000 535720 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 535792
2025-12-04T13:21:31.3855097Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:21:31.3855136Z _warn_cpu_init()
2025-12-04T13:21:31.3855708Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:21:31.3855749Z _warn_cpu_init() 2025-12-04T13:21:31.3856323Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3856360Z _warn_cpu_init() 2025-12-04T13:21:31.3856924Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3856964Z _warn_cpu_init() 2025-12-04T13:21:31.3857255Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3857298Z return func(*args, **kwargs) 2025-12-04T13:21:31.3857441Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3857613Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3857930Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3858087Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3858404Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3858530Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3858808Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3858959Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3859235Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3859380Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3859656Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3859794Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3860073Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3860220Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3860723Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3860841Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3861037Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3861408Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3861521Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3861734Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3861909Z [rank1]:E1204 13:07:00.983000 535790 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3861971Z dist init r=1, world=4 2025-12-04T13:21:31.3862108Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3862268Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3862555Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3862710Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3862997Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3863123Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3863399Z [rank3]:E1204 13:07:00.985000 535792 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3863547Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3863824Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3863972Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3864247Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3864384Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3864661Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3864819Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3865310Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 
2025-12-04T13:21:31.3865427Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3865622Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3865992Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3866116Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3866336Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3866516Z [rank3]:E1204 13:07:00.985000 535792 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3866554Z dist init r=3, world=4 2025-12-04T13:21:31.3866692Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3866851Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3867140Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3867297Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3867582Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3867706Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3867981Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3868129Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3868445Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3868593Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3868868Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3869003Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3869293Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3869443Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3869934Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 2025-12-04T13:21:31.3870049Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3870244Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3870628Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3870763Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3870973Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3871136Z [rank2]:E1204 13:07:01.011000 535791 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3871174Z dist init r=2, world=4 2025-12-04T13:21:31.3871311Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3871472Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3871763Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3871917Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3872203Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3872327Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3872603Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3872752Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.3873028Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3873175Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3873458Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3873596Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3873874Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3874022Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3874511Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3874636Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3874851Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3875230Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3875344Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3875554Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3875719Z [rank0]:E1204 13:07:01.019000 535789 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3875757Z dist init r=0, world=4 2025-12-04T13:21:31.3876096Z [rank0]:[W1204 13:07:01.983265094 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3876136Z FAILED [58.0582s] [100%] 2025-12-04T13:21:31.3876140Z 2025-12-04T13:21:31.3876197Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3876309Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3876354Z Traceback (most recent call last): 2025-12-04T13:21:31.3876519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3876562Z self._join_processes(fn) 2025-12-04T13:21:31.3876738Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3876793Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3876972Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3877015Z raise RuntimeError(error) 2025-12-04T13:21:31.3877095Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3877140Z Traceback (most recent call last): 2025-12-04T13:21:31.3877302Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3877344Z getattr(self, test_name)() 2025-12-04T13:21:31.3877513Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3877547Z fn() 2025-12-04T13:21:31.3877705Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3877747Z method(*args, **kwargs) 2025-12-04T13:21:31.3877899Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3877939Z method(*args, **kwargs) 2025-12-04T13:21:31.3878091Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3878128Z with policy(): 2025-12-04T13:21:31.3878346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3878387Z raise RuntimeError(msg) 2025-12-04T13:21:31.3878769Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:21:31.3878805Z 2025-12-04T13:21:31.3878881Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3879126Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3879128Z 2025-12-04T13:21:31.3879217Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3879219Z 2025-12-04T13:21:31.3879221Z 2025-12-04T13:21:31.3879296Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3879386Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3879619Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-cd73215e76dd89cf.xml - 2025-12-04T13:21:31.3879682Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3879940Z FAILED [58.0582s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3879987Z Traceback (most recent call last): 2025-12-04T13:21:31.3880150Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3880192Z getattr(self, test_name)() 2025-12-04T13:21:31.3880352Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3880387Z fn() 2025-12-04T13:21:31.3880541Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3880582Z method(*args, **kwargs) 2025-12-04T13:21:31.3880737Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3880776Z method(*args, **kwargs) 2025-12-04T13:21:31.3880925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3880961Z with policy(): 2025-12-04T13:21:31.3881115Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3881155Z raise RuntimeError(msg) 2025-12-04T13:21:31.3881533Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3881537Z 2025-12-04T13:21:31.3881611Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3881857Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3881859Z 2025-12-04T13:21:31.3881948Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3882009Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
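Editor's note: the retried sessions repeat the _init_utils.py UserWarning about FSDP's sharding initialization running on CPU. A minimal sketch of what that warning suggests, assuming one GPU per rank and an already-initialized process group; the wrap_model() helper and the toy Linear module are illustrative, not part of the test under investigation.

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def wrap_model(rank: int) -> FSDP:
    model = nn.Linear(8, 8)  # the module can stay on CPU here
    return FSDP(
        model,
        device_id=torch.device(f"cuda:{rank}"),  # moves sharding init onto the GPU
        sync_module_states=True,  # needs GPU communication, hence device_id above
    )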
2025-12-04T13:21:31.3882073Z ====================== 1 failed, 18 deselected in 58.19s =======================
2025-12-04T13:21:31.3882111Z Got exit code 1
2025-12-04T13:21:31.3882153Z Retrying single test...
2025-12-04T13:21:31.3882344Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-80a41ceace54cce5.xml
2025-12-04T13:21:31.3882415Z ============================= test session starts ==============================
2025-12-04T13:21:31.3882537Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python
2025-12-04T13:21:31.3882591Z cachedir: .pytest_cache
2025-12-04T13:21:31.3882749Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
2025-12-04T13:21:31.3882797Z rootdir: /var/lib/jenkins/pytorch
2025-12-04T13:21:31.3882837Z configfile: pytest.ini
2025-12-04T13:21:31.3883001Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0
2025-12-04T13:21:31.3883076Z collecting ... collected 60 items / 18 deselected / 42 selected
2025-12-04T13:21:31.3883315Z stepcurrent: skipping 5 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda
2025-12-04T13:21:31.3883361Z Running 1 items in this shard
2025-12-04T13:21:31.3883364Z
2025-12-04T13:21:31.3883687Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda I1204 13:07:05.671000 536122 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 536191
2025-12-04T13:21:31.3883846Z I1204 13:07:05.672000 536122 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 536192
2025-12-04T13:21:31.3883998Z I1204 13:07:05.673000 536122 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 536193
2025-12-04T13:21:31.3884149Z I1204 13:07:05.673000 536122 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 536194
2025-12-04T13:21:31.3884730Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:21:31.3884772Z _warn_cpu_init()
2025-12-04T13:21:31.3885340Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication.
2025-12-04T13:21:31.3885387Z _warn_cpu_init()
2025-12-04T13:21:31.3885684Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context.
You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3885729Z return func(*args, **kwargs) 2025-12-04T13:21:31.3886304Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3886341Z _warn_cpu_init() 2025-12-04T13:21:31.3886928Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3886988Z _warn_cpu_init() 2025-12-04T13:21:31.3887132Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3887295Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3887585Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3887741Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3888030Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3888195Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3888477Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3888627Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3888907Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3889055Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3889333Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3889471Z [rank1]:E1204 13:08:01.669000 536192 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3889753Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3889915Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3890408Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3890527Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3890725Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3891118Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3891248Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3891474Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3891641Z [rank1]:E1204 13:08:01.669000 536192 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3891679Z dist init r=1, world=4 2025-12-04T13:21:31.3891820Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3891981Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3892271Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3892427Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3892714Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3892838Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3893120Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3893270Z [rank3]:E1204 
13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3893547Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3893695Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3893971Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3894119Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3894398Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3894550Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3895041Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 3. CUDA driver allocated memory was 2250244096 and is now 3762290688. 2025-12-04T13:21:31.3895157Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3895366Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3895748Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3895873Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3896083Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3896249Z [rank3]:E1204 13:08:01.680000 536194 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3896289Z dist init r=3, world=4 2025-12-04T13:21:31.3896427Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3896589Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3896876Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:21:31.3897031Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3897316Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3897442Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3897722Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3897872Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3898194Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3898341Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3898633Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3898771Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3899049Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3899197Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3899701Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3812622336. 
2025-12-04T13:21:31.3899833Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3900045Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3900418Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3900532Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3900746Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3900910Z [rank2]:E1204 13:08:01.684000 536193 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3900951Z dist init r=2, world=4 2025-12-04T13:21:31.3901087Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3901248Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3901535Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3901689Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3901975Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3902101Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3902381Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3902529Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3902817Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3902967Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3903243Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3903380Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3903656Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3903807Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3904308Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 0. CUDA driver allocated memory was 2453667840 and is now 3965714432. 2025-12-04T13:21:31.3904444Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3904642Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3905016Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3905131Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3905342Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3905508Z [rank0]:E1204 13:08:01.721000 536191 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3905546Z dist init r=0, world=4 2025-12-04T13:21:31.3905886Z [rank0]:[W1204 13:08:01.633961664 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3905927Z FAILED [57.8521s] [100%] 2025-12-04T13:21:31.3905929Z 2025-12-04T13:21:31.3905986Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3906099Z _ TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.3906147Z Traceback (most recent call last): 2025-12-04T13:21:31.3906311Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3906355Z self._join_processes(fn) 2025-12-04T13:21:31.3906529Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3906582Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3906761Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3906821Z raise RuntimeError(error) 2025-12-04T13:21:31.3906903Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3906950Z Traceback (most recent call last): 2025-12-04T13:21:31.3907114Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3907157Z getattr(self, test_name)() 2025-12-04T13:21:31.3907317Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3907351Z fn() 2025-12-04T13:21:31.3907504Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3907545Z method(*args, **kwargs) 2025-12-04T13:21:31.3907697Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3907738Z method(*args, **kwargs) 2025-12-04T13:21:31.3907900Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3907947Z with policy(): 2025-12-04T13:21:31.3908101Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3908197Z raise RuntimeError(msg) 2025-12-04T13:21:31.3908562Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 
2025-12-04T13:21:31.3908564Z 2025-12-04T13:21:31.3908641Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3908886Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3908888Z 2025-12-04T13:21:31.3908979Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3908982Z 2025-12-04T13:21:31.3908984Z 2025-12-04T13:21:31.3909059Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3909149Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3909385Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-80a41ceace54cce5.xml - 2025-12-04T13:21:31.3909448Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3909711Z FAILED [57.8521s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.3909757Z Traceback (most recent call last): 2025-12-04T13:21:31.3909922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3909965Z getattr(self, test_name)() 2025-12-04T13:21:31.3910128Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3910162Z fn() 2025-12-04T13:21:31.3910314Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3910354Z method(*args, **kwargs) 2025-12-04T13:21:31.3910506Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3910545Z method(*args, **kwargs) 2025-12-04T13:21:31.3910709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3910747Z with policy(): 2025-12-04T13:21:31.3910901Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3910956Z raise RuntimeError(msg) 2025-12-04T13:21:31.3911364Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 1. CUDA driver allocated memory was 2317352960 and is now 3829399552. 2025-12-04T13:21:31.3911366Z 2025-12-04T13:21:31.3911442Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3911688Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3911690Z 2025-12-04T13:21:31.3911780Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3911858Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
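The failure above comes from the CUDA memory leak check enabled via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: the harness snapshots per-device memory before the test body and re-measures it on exit, failing the test when the caching-allocator count has grown and the CUDA driver API confirms that driver-level allocation grew as well (here 512 -> 12800 bytes allocator-side and roughly 2.3 GB -> 3.8 GB driver-side on each rank). Below is a rough, self-contained sketch of that before/after comparison using only public torch.cuda calls; it illustrates the idea and is not the actual implementation in torch/testing/_internal/common_utils.py.

import torch

def run_with_leak_check(device: int, test_fn) -> None:
    # Snapshot caching-allocator usage and driver-level usage before the test.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before

    test_fn()

    # Re-measure after the test; growth in both counters suggests a leak.
    torch.cuda.synchronize(device)
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
        )

The repro command printed with the failure (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda) re-runs just this test with the same check enabled; PYTORCH_PRINT_REPRO_ON_FAILURE=0 only suppresses that repro message.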
2025-12-04T13:21:31.3911937Z ====================== 1 failed, 18 deselected in 57.99s ======================= 2025-12-04T13:21:31.3911989Z Got exit code 1 2025-12-04T13:21:31.3912183Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.3912311Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.3912502Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-09c520c1ae6de888.xml 2025-12-04T13:21:31.3912560Z ============================= test session starts ============================== 2025-12-04T13:21:31.3912674Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3912715Z cachedir: .pytest_cache 2025-12-04T13:21:31.3912876Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3912924Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3912966Z configfile: pytest.ini 2025-12-04T13:21:31.3913128Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3913204Z collecting ... collected 60 items / 6 deselected / 54 selected 2025-12-04T13:21:31.3913257Z stepcurrent: skipping 6 already run items. 2025-12-04T13:21:31.3913301Z Running 13 items in this shard 2025-12-04T13:21:31.3913303Z 2025-12-04T13:21:31.3913618Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda I1204 13:08:05.974000 536524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 536593 2025-12-04T13:21:31.3913774Z I1204 13:08:05.975000 536524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 536594 2025-12-04T13:21:31.3913928Z I1204 13:08:05.975000 536524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 536595 2025-12-04T13:21:31.3914080Z I1204 13:08:05.976000 536524 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 536596 2025-12-04T13:21:31.3914661Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3914708Z _warn_cpu_init() 2025-12-04T13:21:31.3915278Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.3915319Z _warn_cpu_init() 2025-12-04T13:21:31.3915883Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3915922Z _warn_cpu_init() 2025-12-04T13:21:31.3916497Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3916554Z _warn_cpu_init() 2025-12-04T13:21:31.3917051Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3917117Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3917608Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3917670Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3918198Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3918258Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3918749Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3918810Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3919104Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.3919190Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3919694Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3919753Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3920046Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3920126Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3920416Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3920511Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3920808Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3920907Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3921404Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3921464Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3921950Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3922011Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3922297Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3922377Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3922665Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.3922744Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3923032Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3923107Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3923392Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3923465Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3924775Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3924906Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3925148Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3925211Z return func(*args, **kwargs) 2025-12-04T13:21:31.3926491Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.3926616Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3927884Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3928008Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3928284Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3928328Z return func(*args, **kwargs) 2025-12-04T13:21:31.3928565Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3928607Z return func(*args, **kwargs) 2025-12-04T13:21:31.3929881Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3930016Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3930265Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3930307Z return func(*args, **kwargs) 2025-12-04T13:21:31.3930528Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:21:31.3930569Z return func(*args, **kwargs) 2025-12-04T13:21:31.3930789Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.3930830Z return func(*args, **kwargs) 2025-12-04T13:21:31.3931050Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.3931093Z return func(*args, **kwargs) 2025-12-04T13:21:31.3931315Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.3931355Z return func(*args, **kwargs) 2025-12-04T13:21:31.3931646Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3931687Z return func(*args, **kwargs) 2025-12-04T13:21:31.3931833Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3931998Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3932291Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3932448Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3932735Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3932861Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3933152Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3933303Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3933587Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3933737Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3934014Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3934164Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3934450Z 
[rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3934611Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3935094Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:21:31.3935211Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3935409Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3935776Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3935893Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3936106Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3936273Z [rank0]:E1204 13:08:14.507000 536593 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3936312Z dist init r=0, world=4 2025-12-04T13:21:31.3936453Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3936614Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3936902Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3937058Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3937351Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3937479Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3937755Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3937904Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.3938224Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3938376Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3938667Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3938833Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3939111Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3939259Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3939742Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:21:31.3939858Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3940054Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3940416Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3940532Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3940747Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3940912Z [rank1]:E1204 13:08:14.510000 536594 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3940951Z dist init r=1, world=4 2025-12-04T13:21:31.3941088Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3941249Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3941550Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3941706Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3941991Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3942116Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3942392Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3942540Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3942825Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3942992Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3943268Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3943405Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3943682Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3943832Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3944312Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 
2025-12-04T13:21:31.3944428Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3944623Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3944985Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3945101Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3945313Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3945478Z [rank2]:E1204 13:08:14.519000 536595 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3945516Z dist init r=2, world=4 2025-12-04T13:21:31.3945655Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3945823Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3946111Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3946266Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3946551Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3946675Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3946951Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3947119Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3947403Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3947553Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3947830Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3947969Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3948295Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3948446Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3948924Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.3949038Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3949236Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3949597Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3949711Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3949924Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3950101Z [rank3]:E1204 13:08:14.554000 536596 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3950141Z dist init r=3, world=4 2025-12-04T13:21:31.3950477Z [rank0]:[W1204 13:08:14.396009769 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3950808Z [rank1]:[W1204 13:08:14.442966570 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3951133Z [rank2]:[W1204 13:08:14.456009417 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3951474Z [rank3]:[W1204 13:08:14.501052750 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3951545Z FAILED [22.9246s] [ 7%] 2025-12-04T13:21:31.3951547Z 2025-12-04T13:21:31.3951605Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3951707Z __ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda ___ 2025-12-04T13:21:31.3951753Z Traceback (most recent call last): 2025-12-04T13:21:31.3951919Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3951962Z self._join_processes(fn) 2025-12-04T13:21:31.3952137Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3952191Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3952373Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3952418Z raise RuntimeError(error) 2025-12-04T13:21:31.3952501Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3952546Z Traceback (most recent call last): 2025-12-04T13:21:31.3952710Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3952752Z getattr(self, test_name)() 2025-12-04T13:21:31.3952910Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3952945Z fn() 2025-12-04T13:21:31.3953100Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3953140Z method(*args, **kwargs) 2025-12-04T13:21:31.3953293Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3953336Z method(*args, **kwargs) 2025-12-04T13:21:31.3953485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3953522Z with policy(): 2025-12-04T13:21:31.3953673Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3953714Z raise RuntimeError(msg) 2025-12-04T13:21:31.3954079Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 
2025-12-04T13:21:31.3954081Z 2025-12-04T13:21:31.3954158Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3954395Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3954399Z 2025-12-04T13:21:31.3954487Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3954489Z 2025-12-04T13:21:31.3954491Z 2025-12-04T13:21:31.3954567Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.3954654Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.3954889Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-09c520c1ae6de888.xml - 2025-12-04T13:21:31.3954949Z =========================== short test summary info ============================ 2025-12-04T13:21:31.3955217Z FAILED [22.9246s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.3955284Z Traceback (most recent call last): 2025-12-04T13:21:31.3955449Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3955491Z getattr(self, test_name)() 2025-12-04T13:21:31.3955652Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3955686Z fn() 2025-12-04T13:21:31.3955839Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3955878Z method(*args, **kwargs) 2025-12-04T13:21:31.3956031Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3956070Z method(*args, **kwargs) 2025-12-04T13:21:31.3956222Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3956260Z with policy(): 2025-12-04T13:21:31.3956413Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3956453Z raise RuntimeError(msg) 2025-12-04T13:21:31.3956814Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:21:31.3956816Z 2025-12-04T13:21:31.3956893Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3957129Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3957132Z 2025-12-04T13:21:31.3957220Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3957284Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
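Besides the leak-check failure itself, this run repeatedly surfaces two warnings that point at the same per-rank setup and teardown pattern: the _init_utils.py:571 UserWarning about passing `device_id` "cuda" without an explicit index (which recommends torch.cuda.set_device() or an explicit index), and the ProcessGroupNCCL.cpp:1553 warning that destroy_process_group() was not called before exit. The following is a minimal per-rank sketch under the assumption of a single node with one GPU per rank; the Linear module, master address, and port are hypothetical stand-ins, not the test suite's actual model or rendezvous settings.

import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def run(rank: int, world_size: int) -> None:
    # Hypothetical single-node rendezvous settings.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")

    # Bind this process to its GPU before any FSDP/NCCL work so that "cuda"
    # without an index resolves to the intended device (the fix suggested by
    # the _init_utils.py:571 warning above).
    torch.cuda.set_device(rank)
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    try:
        model = torch.nn.Linear(8, 8)             # hypothetical stand-in model
        fsdp_model = FSDP(model, device_id=rank)  # explicit device index
        x = torch.randn(2, 8, device=torch.device("cuda", rank))
        fsdp_model(x).sum().backward()
    finally:
        # Explicit teardown; skipping this is what triggers the
        # ProcessGroupNCCL.cpp:1553 warning about destroy_process_group().
        dist.destroy_process_group()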
2025-12-04T13:21:31.3957346Z ======================= 1 failed, 6 deselected in 23.06s ======================= 2025-12-04T13:21:31.3957383Z Got exit code 1 2025-12-04T13:21:31.3957424Z Retrying single test... 2025-12-04T13:21:31.3957611Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2fb1a3772346bf41.xml 2025-12-04T13:21:31.3957669Z ============================= test session starts ============================== 2025-12-04T13:21:31.3957792Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.3957835Z cachedir: .pytest_cache 2025-12-04T13:21:31.3957994Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.3958044Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.3958083Z configfile: pytest.ini 2025-12-04T13:21:31.3958289Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.3958365Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.3958591Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3958636Z Running 1 items in this shard 2025-12-04T13:21:31.3958638Z 2025-12-04T13:21:31.3958946Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda I1204 13:08:31.443000 537790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 537859 2025-12-04T13:21:31.3959135Z I1204 13:08:31.444000 537790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 537860 2025-12-04T13:21:31.3959299Z I1204 13:08:31.445000 537790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 537861 2025-12-04T13:21:31.3959450Z I1204 13:08:31.445000 537790 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 537862 2025-12-04T13:21:31.3960032Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3960071Z _warn_cpu_init() 2025-12-04T13:21:31.3960568Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3960630Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3961200Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3961237Z _warn_cpu_init() 2025-12-04T13:21:31.3961733Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3961795Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3962372Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3962412Z _warn_cpu_init() 2025-12-04T13:21:31.3962973Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.3963011Z _warn_cpu_init() 2025-12-04T13:21:31.3963304Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3963388Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3963892Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3963969Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3964260Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.3964338Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3964625Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3964708Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3965199Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3965259Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3965745Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3965805Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3966290Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.3966347Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3966648Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3966724Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3967011Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3967092Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3967375Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3967454Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.3967953Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.3968029Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.3968361Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3968437Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3968724Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.3968800Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.3970083Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3970208Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3970439Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3970484Z return func(*args, **kwargs) 2025-12-04T13:21:31.3971777Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.3971902Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3972128Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3972170Z return func(*args, **kwargs) 2025-12-04T13:21:31.3973454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3973598Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3973825Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3973869Z return func(*args, **kwargs) 2025-12-04T13:21:31.3975133Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.3975256Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.3975482Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3975522Z return func(*args, **kwargs) 2025-12-04T13:21:31.3975745Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:21:31.3975785Z return func(*args, **kwargs) 2025-12-04T13:21:31.3976018Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3976058Z return func(*args, **kwargs) 2025-12-04T13:21:31.3976279Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3976320Z return func(*args, **kwargs) 2025-12-04T13:21:31.3976538Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.3976577Z return func(*args, **kwargs) 2025-12-04T13:21:31.3976869Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.3976909Z return func(*args, **kwargs) 2025-12-04T13:21:31.3977055Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3977228Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3977538Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3977695Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3977980Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3978109Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3978502Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3978653Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3978928Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3979076Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3979352Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3979490Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3979767Z 
[rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3979915Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3980408Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.3980528Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3980727Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3981091Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3981205Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3981418Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3981594Z [rank3]:E1204 13:08:40.282000 537862 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.3981659Z dist init r=3, world=4 2025-12-04T13:21:31.3981796Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3981956Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3982243Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3982398Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3982683Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3982810Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3983089Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3983235Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.3983510Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3983660Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3983936Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3984074Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3984349Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3984507Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3984988Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:21:31.3985106Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3985304Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3985665Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3985789Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3986009Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3986189Z [rank1]:E1204 13:08:40.335000 537860 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.3986228Z dist init r=1, world=4 2025-12-04T13:21:31.3986366Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3986525Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3986813Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3986970Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.3987255Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3987379Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3987657Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3987806Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3988082Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3988276Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3988552Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3988686Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.3988977Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3989127Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3989608Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 
2025-12-04T13:21:31.3989722Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3989918Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3990291Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3990428Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3990639Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3990802Z [rank2]:E1204 13:08:40.366000 537861 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.3990842Z dist init r=2, world=4 2025-12-04T13:21:31.3990981Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.3991142Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.3991428Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3991583Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.3991868Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3991992Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.3992271Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3992420Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3992740Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3992917Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.3993207Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3993346Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.3993623Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3993772Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.3994251Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:21:31.3994377Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3994584Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3994956Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3995070Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.3995280Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3995444Z [rank0]:E1204 13:08:40.373000 537859 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.3995483Z dist init r=0, world=4 2025-12-04T13:21:31.3995821Z [rank3]:[W1204 13:08:40.128034835 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3996151Z [rank1]:[W1204 13:08:40.362097494 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3996478Z [rank2]:[W1204 13:08:40.467382685 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3996805Z [rank0]:[W1204 13:08:40.503820870 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.3996847Z FAILED [23.0261s] [100%] 2025-12-04T13:21:31.3996849Z 2025-12-04T13:21:31.3996906Z =================================== FAILURES =================================== 2025-12-04T13:21:31.3997007Z __ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda ___ 2025-12-04T13:21:31.3997054Z Traceback (most recent call last): 2025-12-04T13:21:31.3997218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.3997263Z self._join_processes(fn) 2025-12-04T13:21:31.3997445Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.3997503Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.3997682Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.3997728Z raise RuntimeError(error) 2025-12-04T13:21:31.3997808Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.3997854Z Traceback (most recent call last): 2025-12-04T13:21:31.3998014Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.3998056Z getattr(self, test_name)() 2025-12-04T13:21:31.3998249Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.3998283Z fn() 2025-12-04T13:21:31.3998438Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3998505Z method(*args, **kwargs) 2025-12-04T13:21:31.3998657Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.3998709Z method(*args, **kwargs) 2025-12-04T13:21:31.3998860Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.3998896Z with policy(): 2025-12-04T13:21:31.3999049Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.3999089Z raise RuntimeError(msg) 2025-12-04T13:21:31.3999449Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 
2025-12-04T13:21:31.3999451Z 2025-12-04T13:21:31.3999528Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.3999764Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.3999767Z 2025-12-04T13:21:31.3999855Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.3999858Z 2025-12-04T13:21:31.3999860Z 2025-12-04T13:21:31.3999934Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4000022Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4000257Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2fb1a3772346bf41.xml - 2025-12-04T13:21:31.4000317Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4000570Z FAILED [23.0261s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4000619Z Traceback (most recent call last): 2025-12-04T13:21:31.4000782Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4000825Z getattr(self, test_name)() 2025-12-04T13:21:31.4000984Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4001019Z fn() 2025-12-04T13:21:31.4001170Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4001222Z method(*args, **kwargs) 2025-12-04T13:21:31.4001374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4001415Z method(*args, **kwargs) 2025-12-04T13:21:31.4001567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4001605Z with policy(): 2025-12-04T13:21:31.4001757Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4001798Z raise RuntimeError(msg) 2025-12-04T13:21:31.4002153Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.4002156Z 2025-12-04T13:21:31.4002231Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4002477Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4002509Z 2025-12-04T13:21:31.4002596Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4002660Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
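[Editor's note] The retry above repeats the same cluster of warnings on every rank: FSDP receives a bare `device_id` of "cuda" without an index, the wrapped module is still on CPU during sharding init, barrier() has to guess the device, and destroy_process_group() is never called before exit. The sketch below shows the per-rank setup those warnings ask for; it is illustrative only (the helper names setup_rank/teardown are not part of the test harness, MASTER_ADDR/MASTER_PORT are assumed to come from the launcher, and the plain FSDP(model, device_id=rank) call stands in for the test's real wrapping policy).

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def setup_rank(rank: int, world_size: int, model: torch.nn.Module):
        # Bind this process to one GPU before any collective or FSDP init,
        # which is what the "does not have an explicit index" warning asks for.
        torch.cuda.set_device(rank)
        # Passing device_id here also addresses the barrier() warning about
        # "using the device under current context".
        dist.init_process_group(
            "nccl", rank=rank, world_size=world_size,
            device_id=torch.device("cuda", rank),
        )
        # An explicit index (not just "cuda") lets FSDP run sharding init on the
        # GPU instead of the slower CPU path the first warning complains about.
        return FSDP(model, device_id=rank)

    def teardown():
        # Addresses the ProcessGroupNCCL warning that destroy_process_group()
        # was not called before program exit.
        dist.destroy_process_group()

The FutureWarnings about NO_SHARD separately suggest moving those configurations to DistributedDataParallel rather than keeping them on FSDP.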
2025-12-04T13:21:31.4002722Z ====================== 1 failed, 18 deselected in 23.16s ======================= 2025-12-04T13:21:31.4002760Z Got exit code 1 2025-12-04T13:21:31.4002800Z Retrying single test... 2025-12-04T13:21:31.4002991Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2e65debb59102de6.xml 2025-12-04T13:21:31.4003049Z ============================= test session starts ============================== 2025-12-04T13:21:31.4003162Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4003204Z cachedir: .pytest_cache 2025-12-04T13:21:31.4003364Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4003411Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4003452Z configfile: pytest.ini 2025-12-04T13:21:31.4003614Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4003690Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4003917Z stepcurrent: skipping 6 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4003961Z Running 1 items in this shard 2025-12-04T13:21:31.4003963Z 2025-12-04T13:21:31.4004272Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda I1204 13:08:57.247000 539056 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 539125 2025-12-04T13:21:31.4004429Z I1204 13:08:57.248000 539056 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 539126 2025-12-04T13:21:31.4004581Z I1204 13:08:57.248000 539056 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 539127 2025-12-04T13:21:31.4004730Z I1204 13:08:57.249000 539056 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 539128 2025-12-04T13:21:31.4005330Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4005369Z _warn_cpu_init() 2025-12-04T13:21:31.4005862Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4005925Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4006509Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4006567Z _warn_cpu_init() 2025-12-04T13:21:31.4007061Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4007120Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4007692Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4007729Z _warn_cpu_init() 2025-12-04T13:21:31.4008400Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4008459Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4009028Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4009069Z _warn_cpu_init() 2025-12-04T13:21:31.4009363Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4009450Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4009736Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4009833Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4010329Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4010389Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4010679Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4010756Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4011044Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4011149Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4011448Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4011527Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4012022Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4012081Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4012566Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4012626Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4012913Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4012988Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4013275Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4013356Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4013849Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4013906Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4014207Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4014280Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4015563Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4015702Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4015951Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4015996Z return func(*args, **kwargs) 2025-12-04T13:21:31.4017267Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.4017391Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4017620Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4017663Z return func(*args, **kwargs) 2025-12-04T13:21:31.4018965Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4019102Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4019331Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4019374Z return func(*args, **kwargs) 2025-12-04T13:21:31.4020654Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4020800Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4021024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4021066Z return func(*args, **kwargs) 2025-12-04T13:21:31.4021287Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:21:31.4021327Z return func(*args, **kwargs) 2025-12-04T13:21:31.4021549Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4021591Z return func(*args, **kwargs) 2025-12-04T13:21:31.4021812Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4021853Z return func(*args, **kwargs) 2025-12-04T13:21:31.4022071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4022112Z return func(*args, **kwargs) 2025-12-04T13:21:31.4022404Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4022445Z return func(*args, **kwargs) 2025-12-04T13:21:31.4022592Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4022756Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4023048Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4023204Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4023498Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4023625Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4023903Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4024054Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4024331Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4024481Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4024765Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4024921Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4025198Z 
[rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4025348Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4025832Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 0. CUDA driver allocated memory was 2453667840 and is now 17628659712. 2025-12-04T13:21:31.4025949Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4026147Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4026513Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4026628Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4026841Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4027008Z [rank0]:E1204 13:09:05.803000 539125 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4027048Z dist init r=0, world=4 2025-12-04T13:21:31.4027188Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4027348Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4027635Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4027804Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4028089Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4028253Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4028529Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4028679Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.4028959Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4029134Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4029422Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4029558Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4029835Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4029985Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4030468Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.4030585Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4030779Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4031144Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4031260Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4031474Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4031638Z [rank3]:E1204 13:09:05.812000 539128 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4031776Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4031933Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4032234Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4032390Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4032673Z 
[rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4032797Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4033073Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4033233Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4033519Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4033677Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4033952Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4034088Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4034365Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4034514Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4034994Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 
2025-12-04T13:21:31.4035109Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4035305Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4035667Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4035782Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4035994Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4036157Z [rank2]:E1204 13:09:05.812000 539127 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4036197Z dist init r=3, world=4 2025-12-04T13:21:31.4036250Z dist init r=2, world=4 2025-12-04T13:21:31.4036389Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4036549Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4036836Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4036990Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4037274Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4037400Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4037683Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4037853Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4038131Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4038316Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4038592Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4038728Z [rank1]:E1204 13:09:05.865000 539126 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4039009Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4039156Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4039636Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 1. CUDA driver allocated memory was 2317352960 and is now 17492344832. 2025-12-04T13:21:31.4039751Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4039947Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4040307Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4040420Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4040644Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4040809Z [rank1]:E1204 13:09:05.865000 539126 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4040849Z dist init r=1, world=4 2025-12-04T13:21:31.4041186Z [rank0]:[W1204 13:09:06.669658255 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4041516Z [rank3]:[W1204 13:09:06.670956054 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4041856Z [rank2]:[W1204 13:09:06.671696362 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4042194Z [rank1]:[W1204 13:09:06.871387068 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4042250Z FAILED [22.8248s] [100%] 2025-12-04T13:21:31.4042252Z 2025-12-04T13:21:31.4042309Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4042411Z __ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda ___ 2025-12-04T13:21:31.4042456Z Traceback (most recent call last): 2025-12-04T13:21:31.4042622Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4042665Z self._join_processes(fn) 2025-12-04T13:21:31.4042843Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4042899Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4043078Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4043122Z raise RuntimeError(error) 2025-12-04T13:21:31.4043204Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4043250Z Traceback (most recent call last): 2025-12-04T13:21:31.4043411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4043454Z getattr(self, test_name)() 2025-12-04T13:21:31.4043613Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4043647Z fn() 2025-12-04T13:21:31.4043800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4043843Z method(*args, **kwargs) 2025-12-04T13:21:31.4043994Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4044034Z method(*args, **kwargs) 2025-12-04T13:21:31.4044183Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4044221Z with policy(): 2025-12-04T13:21:31.4044371Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4044412Z raise RuntimeError(msg) 2025-12-04T13:21:31.4044781Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 
2025-12-04T13:21:31.4044785Z 2025-12-04T13:21:31.4044864Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4045101Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4045105Z 2025-12-04T13:21:31.4045192Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4045195Z 2025-12-04T13:21:31.4045255Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4045300Z Traceback (most recent call last): 2025-12-04T13:21:31.4045465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4045507Z getattr(self, test_name)() 2025-12-04T13:21:31.4045679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4045732Z fn() 2025-12-04T13:21:31.4045884Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4045923Z method(*args, **kwargs) 2025-12-04T13:21:31.4046073Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4046113Z method(*args, **kwargs) 2025-12-04T13:21:31.4046263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4046299Z with policy(): 2025-12-04T13:21:31.4046452Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4046492Z raise RuntimeError(msg) 2025-12-04T13:21:31.4046848Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.4046852Z 2025-12-04T13:21:31.4046925Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4047157Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4047159Z 2025-12-04T13:21:31.4047247Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4047249Z 2025-12-04T13:21:31.4047251Z 2025-12-04T13:21:31.4047328Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4047418Z Process 2 terminated with exit code 10, terminating remaining processes. 
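The failure above is not an assertion failure but the CUDA memory-leak check this shard runs with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1: the test wrapper compares caching-allocator usage on each device before and after the test body and raises if it grew (512 bytes reported before, 215552 after in the message above). A minimal sketch of that before/after comparison, using a hypothetical run_with_leak_check helper rather than the actual torch.testing._internal wrapper:

import torch

def run_with_leak_check(test_fn, device=0):
    # Snapshot caching-allocator usage on the target device before the test body runs.
    torch.cuda.synchronize(device)
    before = torch.cuda.memory_allocated(device)
    test_fn()
    # Release cached blocks, then compare against the starting point.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    after = torch.cuda.memory_allocated(device)
    if after > before:
        raise RuntimeError(
            f"possible CUDA memory leak on device {device}: "
            f"allocated memory went from {before} to {after} bytes"
        )

The quoted repro line (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda) re-enables the real check when reproducing locally.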
2025-12-04T13:21:31.4047655Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2e65debb59102de6.xml - 2025-12-04T13:21:31.4047719Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4047969Z FAILED [22.8248s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4048016Z Traceback (most recent call last): 2025-12-04T13:21:31.4048213Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4048257Z getattr(self, test_name)() 2025-12-04T13:21:31.4048438Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4048474Z fn() 2025-12-04T13:21:31.4048627Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4048667Z method(*args, **kwargs) 2025-12-04T13:21:31.4048819Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4048857Z method(*args, **kwargs) 2025-12-04T13:21:31.4049008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4049044Z with policy(): 2025-12-04T13:21:31.4049195Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4049235Z raise RuntimeError(msg) 2025-12-04T13:21:31.4049604Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 2. CUDA driver allocated memory was 2300575744 and is now 17475567616. 
2025-12-04T13:21:31.4049635Z 2025-12-04T13:21:31.4049709Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4049943Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4049945Z 2025-12-04T13:21:31.4050032Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4050036Z 2025-12-04T13:21:31.4050094Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4050138Z Traceback (most recent call last): 2025-12-04T13:21:31.4050301Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4050343Z getattr(self, test_name)() 2025-12-04T13:21:31.4050503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4050539Z fn() 2025-12-04T13:21:31.4050689Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4050732Z method(*args, **kwargs) 2025-12-04T13:21:31.4050881Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4050921Z method(*args, **kwargs) 2025-12-04T13:21:31.4051070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4051107Z with policy(): 2025-12-04T13:21:31.4051258Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4051300Z raise RuntimeError(msg) 2025-12-04T13:21:31.4051652Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 215552 on device 3. CUDA driver allocated memory was 2250244096 and is now 17425235968. 2025-12-04T13:21:31.4051656Z 2025-12-04T13:21:31.4051730Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4051962Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4051965Z 2025-12-04T13:21:31.4052052Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4052117Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
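Beyond the leak itself, the per-rank output above repeats two process-group hygiene warnings: barrier() warned that it picked the device from the current context because init_process_group() was not given a device_id, and ProcessGroupNCCL warned that destroy_process_group() was never called before the worker processes exited. A minimal sketch of the setup/teardown those warnings suggest, assuming a hypothetical worker(rank, world_size) entry point and that the rendezvous (MASTER_ADDR/MASTER_PORT) is already configured by the harness:

import torch
import torch.distributed as dist

def worker(rank: int, world_size: int) -> None:
    # Bind this process to one GPU and tell the process group which device it owns;
    # passing device_id addresses the barrier() "device under current context" warning.
    torch.cuda.set_device(rank)
    dist.init_process_group(
        backend="nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),
    )
    try:
        dist.barrier()
        # ... test or training body ...
    finally:
        # Explicit teardown avoids the ProcessGroupNCCL destroy_process_group() warning.
        dist.destroy_process_group()

This mirrors the remedies embedded in the warnings themselves (specify device_id in init_process_group, call destroy_process_group before exit); it is not the harness's own spawn code.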
2025-12-04T13:21:31.4052190Z ====================== 1 failed, 18 deselected in 22.96s ======================= 2025-12-04T13:21:31.4052229Z Got exit code 1 2025-12-04T13:21:31.4052411Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda 2025-12-04T13:21:31.4052542Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4052730Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d0307ff0aa7f20f8.xml 2025-12-04T13:21:31.4052789Z ============================= test session starts ============================== 2025-12-04T13:21:31.4052900Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4052942Z cachedir: .pytest_cache 2025-12-04T13:21:31.4053101Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4053148Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4053188Z configfile: pytest.ini 2025-12-04T13:21:31.4053361Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4053454Z collecting ... collected 60 items / 7 deselected / 53 selected 2025-12-04T13:21:31.4053508Z stepcurrent: skipping 7 already run items. 2025-12-04T13:21:31.4053549Z Running 12 items in this shard 2025-12-04T13:21:31.4053551Z 2025-12-04T13:21:31.4053858Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda I1204 13:09:22.599000 540322 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 540391 2025-12-04T13:21:31.4054015Z I1204 13:09:22.600000 540322 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 540392 2025-12-04T13:21:31.4054169Z I1204 13:09:22.600000 540322 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 540393 2025-12-04T13:21:31.4054323Z I1204 13:09:22.601000 540322 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 540394 2025-12-04T13:21:31.4054909Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4054949Z _warn_cpu_init() 2025-12-04T13:21:31.4055250Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4055289Z _init_core_state( 2025-12-04T13:21:31.4055783Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4055847Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4056429Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4056466Z _warn_cpu_init() 2025-12-04T13:21:31.4056766Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4056804Z _init_core_state( 2025-12-04T13:21:31.4057296Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4057357Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4057936Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4057994Z _warn_cpu_init() 2025-12-04T13:21:31.4058337Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4058375Z _init_core_state( 2025-12-04T13:21:31.4058865Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4058926Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4059499Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4059535Z _warn_cpu_init() 2025-12-04T13:21:31.4060025Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4060083Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4060571Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4060628Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4060937Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4060976Z _init_core_state( 2025-12-04T13:21:31.4061462Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4061522Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4062011Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4062068Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4063363Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.4063512Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4064788Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4064912Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4066186Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4066309Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4067587Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.4067734Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4067963Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4068006Z return func(*args, **kwargs) 2025-12-04T13:21:31.4068268Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4068309Z return func(*args, **kwargs) 2025-12-04T13:21:31.4068534Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4068575Z return func(*args, **kwargs) 2025-12-04T13:21:31.4068798Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4068838Z return func(*args, **kwargs) 2025-12-04T13:21:31.4069062Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4069103Z return func(*args, **kwargs) 2025-12-04T13:21:31.4069325Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4069367Z return func(*args, **kwargs) 2025-12-04T13:21:31.4069586Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4069628Z return func(*args, **kwargs) 2025-12-04T13:21:31.4069846Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4069886Z return func(*args, **kwargs) 2025-12-04T13:21:31.4070188Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:21:31.4070229Z return func(*args, **kwargs) 2025-12-04T13:21:31.4070374Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4070540Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4070831Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4070988Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4071275Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4071412Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4071706Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4071869Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4072144Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4072292Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4072569Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4072707Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4072985Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4073134Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4073613Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4073731Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4073931Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4074291Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4074406Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4074628Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4074795Z [rank0]:E1204 13:09:31.214000 540391 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4074835Z dist init r=0, world=4 2025-12-04T13:21:31.4074974Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4075133Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4075422Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4075576Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4078127Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4078496Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4078778Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4078929Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4079207Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4079356Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4079632Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4079771Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4080048Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4080199Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4080680Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 2025-12-04T13:21:31.4080798Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4080997Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4081370Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4081486Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4081701Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4081865Z [rank1]:E1204 13:09:31.217000 540392 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4081905Z dist init r=1, world=4 2025-12-04T13:21:31.4082044Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4082204Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4082509Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4082687Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4082970Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4083096Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4083376Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4083524Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4083799Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4083946Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4084221Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4084356Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4084635Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4084784Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4085263Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T13:21:31.4085378Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4085584Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4085942Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4086056Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4086268Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4086432Z [rank3]:E1204 13:09:31.265000 540394 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4086471Z dist init r=3, world=4 2025-12-04T13:21:31.4086609Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4086777Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4087082Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4087235Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4087519Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4087643Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4087924Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4088074Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4088383Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4088529Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4088805Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4088943Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4089220Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4089367Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4089856Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 
2025-12-04T13:21:31.4089970Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4090168Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4090524Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4090637Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4090849Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4091026Z [rank2]:E1204 13:09:31.266000 540393 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4091096Z dist init r=2, world=4 2025-12-04T13:21:31.4091433Z [rank0]:[W1204 13:09:31.066178555 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4091761Z [rank1]:[W1204 13:09:31.077287465 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4092087Z [rank2]:[W1204 13:09:31.199268541 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4092415Z [rank3]:[W1204 13:09:31.211099768 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4092455Z FAILED [22.8268s] [ 8%] 2025-12-04T13:21:31.4092459Z 2025-12-04T13:21:31.4092518Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4092620Z ____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda _____ 2025-12-04T13:21:31.4092666Z Traceback (most recent call last): 2025-12-04T13:21:31.4092833Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4092877Z self._join_processes(fn) 2025-12-04T13:21:31.4093052Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4093107Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4093287Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4093330Z raise RuntimeError(error) 2025-12-04T13:21:31.4093411Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4093456Z Traceback (most recent call last): 2025-12-04T13:21:31.4093618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4093659Z getattr(self, test_name)() 2025-12-04T13:21:31.4093828Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4093864Z fn() 2025-12-04T13:21:31.4094017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4094058Z method(*args, **kwargs) 2025-12-04T13:21:31.4094210Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4094250Z method(*args, **kwargs) 2025-12-04T13:21:31.4094400Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4094436Z with policy(): 2025-12-04T13:21:31.4094588Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4094629Z raise RuntimeError(msg) 2025-12-04T13:21:31.4094995Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4095007Z 2025-12-04T13:21:31.4095084Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4095327Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4095329Z 2025-12-04T13:21:31.4095419Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4095421Z 2025-12-04T13:21:31.4095424Z 2025-12-04T13:21:31.4095499Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4095587Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4095823Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d0307ff0aa7f20f8.xml - 2025-12-04T13:21:31.4095884Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4096131Z FAILED [22.8268s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4096178Z Traceback (most recent call last): 2025-12-04T13:21:31.4096343Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4096384Z getattr(self, test_name)() 2025-12-04T13:21:31.4096544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4096577Z fn() 2025-12-04T13:21:31.4096730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4096769Z method(*args, **kwargs) 2025-12-04T13:21:31.4096920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4096960Z method(*args, **kwargs) 2025-12-04T13:21:31.4097110Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4097146Z with policy(): 2025-12-04T13:21:31.4097299Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4097338Z raise RuntimeError(msg) 2025-12-04T13:21:31.4097702Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 2025-12-04T13:21:31.4097705Z 2025-12-04T13:21:31.4097779Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4098009Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4098013Z 2025-12-04T13:21:31.4098101Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4098207Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
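Note: the leak message above compares the caching allocator's allocated bytes and the driver-level allocation before and after the test body. A rough, illustrative sketch of that kind of before/after accounting (not the harness's actual leak-check implementation; test_fn is a stand-in for the test body):

import torch

def check_for_leak(test_fn, device=0):
    # Snapshot caching-allocator and driver-level usage before the test body.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before

    test_fn()

    # Snapshot again afterwards; growth in both numbers is roughly what the
    # failure above reports as "Caching allocator allocated memory" and
    # "CUDA driver allocated memory".
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak: allocator {alloc_before} -> {alloc_after} bytes, "
            f"driver {driver_before} -> {driver_after} bytes"
        )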
2025-12-04T13:21:31.4098270Z ======================= 1 failed, 7 deselected in 22.96s ======================= 2025-12-04T13:21:31.4098306Z Got exit code 1 2025-12-04T13:21:31.4098346Z Retrying single test... 2025-12-04T13:21:31.4098536Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6e1069ec4db63983.xml 2025-12-04T13:21:31.4098594Z ============================= test session starts ============================== 2025-12-04T13:21:31.4098723Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4098776Z cachedir: .pytest_cache 2025-12-04T13:21:31.4098934Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4098993Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4099033Z configfile: pytest.ini 2025-12-04T13:21:31.4099197Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4099273Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4099496Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4099540Z Running 1 items in this shard 2025-12-04T13:21:31.4099543Z 2025-12-04T13:21:31.4099853Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda I1204 13:09:47.944000 541588 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 541657 2025-12-04T13:21:31.4100009Z I1204 13:09:47.945000 541588 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 541658 2025-12-04T13:21:31.4100162Z I1204 13:09:47.946000 541588 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 541659 2025-12-04T13:21:31.4100312Z I1204 13:09:47.946000 541588 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 541660 2025-12-04T13:21:31.4100900Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4100939Z _warn_cpu_init() 2025-12-04T13:21:31.4101238Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4101276Z _init_core_state( 2025-12-04T13:21:31.4101790Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4101854Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4102432Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4102471Z _warn_cpu_init() 2025-12-04T13:21:31.4102765Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4102801Z _init_core_state( 2025-12-04T13:21:31.4103306Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4103389Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4103957Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4103995Z _warn_cpu_init() 2025-12-04T13:21:31.4104288Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4104326Z _init_core_state( 2025-12-04T13:21:31.4104819Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4104878Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4105449Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4105486Z _warn_cpu_init() 2025-12-04T13:21:31.4105974Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4106031Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4106531Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4106590Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4106885Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4106922Z _init_core_state( 2025-12-04T13:21:31.4107410Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4107467Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4107965Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4108050Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4109372Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
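Note: the UserWarnings above come from passing `device_id` as a bare "cuda" device with no index, so FSDP falls back to whatever the current device happens to be. A minimal sketch of the fix the warning suggests (hypothetical setup code, not from this test): pin each rank to its GPU and pass an indexed device.

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_for_rank(model, rank):
    # Make the "current device" unambiguous for this process...
    torch.cuda.set_device(rank)
    # ...and give FSDP an explicit index rather than the bare "cuda" device.
    return FSDP(model, device_id=torch.device("cuda", rank))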
2025-12-04T13:21:31.4109503Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4110776Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4110901Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4112187Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4112309Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4113588Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
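Note: the stream-mismatch warning above names its own opt-out; if the mismatch is known to be intentional, it can be silenced globally with the switch the warning itself points to:

import torch

# Suppress the AccumulateGrad stream-mismatch warning when the mismatch is intentional.
torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)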
2025-12-04T13:21:31.4113734Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4113964Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4114008Z return func(*args, **kwargs) 2025-12-04T13:21:31.4114233Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4114275Z return func(*args, **kwargs) 2025-12-04T13:21:31.4114499Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4114539Z return func(*args, **kwargs) 2025-12-04T13:21:31.4114763Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4114803Z return func(*args, **kwargs) 2025-12-04T13:21:31.4115024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4115065Z return func(*args, **kwargs) 2025-12-04T13:21:31.4115283Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4115322Z return func(*args, **kwargs) 2025-12-04T13:21:31.4115542Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4115582Z return func(*args, **kwargs) 2025-12-04T13:21:31.4115814Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4115856Z return func(*args, **kwargs) 2025-12-04T13:21:31.4116147Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
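Note: the barrier() warning above suggests binding the process group to a device at init time. A sketch under the assumption of a typical per-rank env:// setup (MASTER_ADDR/MASTER_PORT already exported); the `device_id` argument to `init_process_group` is available in recent PyTorch releases:

import torch
import torch.distributed as dist

def init_distributed(rank, world_size):
    torch.cuda.set_device(rank)
    # Binding the group to an indexed device lets collectives like barrier()
    # pick the right GPU instead of guessing from the current context.
    dist.init_process_group(
        "nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),
    )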
2025-12-04T13:21:31.4116187Z return func(*args, **kwargs) 2025-12-04T13:21:31.4116332Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4116497Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4116791Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4116961Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4117265Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4117390Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4117668Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4117816Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4118093Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4118279Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4118555Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4118691Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4118970Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4119120Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4119601Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4119718Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4119931Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4120290Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4120407Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4120619Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4120784Z [rank0]:E1204 13:09:56.630000 541657 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4120822Z dist init r=0, world=4 2025-12-04T13:21:31.4120961Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4121120Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4121435Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4121601Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4121887Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4122011Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4122289Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4122438Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4122714Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4122861Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4123138Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4123274Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4123552Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4123703Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4124180Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T13:21:31.4124303Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4124500Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4124856Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4124970Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4125182Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4125345Z [rank3]:E1204 13:09:56.633000 541660 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4125384Z dist init r=3, world=4 2025-12-04T13:21:31.4125545Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4125716Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4126006Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4126165Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4126451Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4126577Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4126854Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4127000Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4127276Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4127422Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4127698Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4127834Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4128111Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4128304Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4128795Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 2025-12-04T13:21:31.4128911Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4129104Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4129459Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4129572Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4129799Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4129989Z [rank2]:E1204 13:09:56.634000 541659 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4130027Z dist init r=2, world=4 2025-12-04T13:21:31.4130166Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4130324Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4130611Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4130765Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4131050Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4131174Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4131449Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4131597Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4131874Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4132023Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4132296Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4132432Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4132716Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4132866Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4133345Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 
2025-12-04T13:21:31.4133459Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4133654Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4134016Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4134139Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4134360Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4134524Z [rank1]:E1204 13:09:56.678000 541658 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4134563Z dist init r=1, world=4 2025-12-04T13:21:31.4134900Z [rank0]:[W1204 13:09:56.483041909 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4135230Z [rank3]:[W1204 13:09:56.497385556 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4135561Z [rank2]:[W1204 13:09:56.499735778 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4135888Z [rank1]:[W1204 13:09:56.621393473 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4135927Z FAILED [23.0278s] [100%] 2025-12-04T13:21:31.4135931Z 2025-12-04T13:21:31.4135989Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4136091Z ____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda _____ 2025-12-04T13:21:31.4136138Z Traceback (most recent call last): 2025-12-04T13:21:31.4136303Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4136346Z self._join_processes(fn) 2025-12-04T13:21:31.4136519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4136572Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4136750Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4136792Z raise RuntimeError(error) 2025-12-04T13:21:31.4136885Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4136929Z Traceback (most recent call last): 2025-12-04T13:21:31.4137091Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4137133Z getattr(self, test_name)() 2025-12-04T13:21:31.4137292Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4137326Z fn() 2025-12-04T13:21:31.4137478Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4137517Z method(*args, **kwargs) 2025-12-04T13:21:31.4137669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4137709Z method(*args, **kwargs) 2025-12-04T13:21:31.4137861Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4137898Z with policy(): 2025-12-04T13:21:31.4138071Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4138125Z raise RuntimeError(msg) 2025-12-04T13:21:31.4138508Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4138511Z 2025-12-04T13:21:31.4138586Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4138815Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4138818Z 2025-12-04T13:21:31.4138907Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4138910Z 2025-12-04T13:21:31.4138971Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4139017Z Traceback (most recent call last): 2025-12-04T13:21:31.4139180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4139221Z getattr(self, test_name)() 2025-12-04T13:21:31.4139378Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4139414Z fn() 2025-12-04T13:21:31.4139564Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4139605Z method(*args, **kwargs) 2025-12-04T13:21:31.4139754Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4139795Z method(*args, **kwargs) 2025-12-04T13:21:31.4139944Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4139983Z with policy(): 2025-12-04T13:21:31.4140134Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4140175Z raise RuntimeError(msg) 2025-12-04T13:21:31.4140525Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 
2025-12-04T13:21:31.4140528Z 2025-12-04T13:21:31.4140601Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4140854Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4140858Z 2025-12-04T13:21:31.4140945Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4140948Z 2025-12-04T13:21:31.4141008Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4141052Z Traceback (most recent call last): 2025-12-04T13:21:31.4141216Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4141257Z getattr(self, test_name)() 2025-12-04T13:21:31.4141415Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4141448Z fn() 2025-12-04T13:21:31.4141600Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4141638Z method(*args, **kwargs) 2025-12-04T13:21:31.4141802Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4141854Z method(*args, **kwargs) 2025-12-04T13:21:31.4142017Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4142054Z with policy(): 2025-12-04T13:21:31.4142205Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4142245Z raise RuntimeError(msg) 2025-12-04T13:21:31.4142595Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T13:21:31.4142598Z 2025-12-04T13:21:31.4142671Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4142898Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4142902Z 2025-12-04T13:21:31.4142989Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4142991Z 2025-12-04T13:21:31.4142993Z 2025-12-04T13:21:31.4143069Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4143156Z Process 0 terminated with exit code 10, terminating remaining processes. 
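Note: the ProcessGroupNCCL warnings earlier in this run point at workers exiting without tearing down the process group. A minimal sketch of the shutdown the warning recommends (run_body is a hypothetical stand-in for the per-rank test body):

import torch.distributed as dist

def worker(rank, world_size):
    try:
        run_body(rank, world_size)  # hypothetical per-rank work
    finally:
        # Explicit teardown avoids the destroy_process_group() warning at exit.
        if dist.is_initialized():
            dist.destroy_process_group()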
2025-12-04T13:21:31.4143389Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6e1069ec4db63983.xml - 2025-12-04T13:21:31.4143448Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4143695Z FAILED [23.0278s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4143741Z Traceback (most recent call last): 2025-12-04T13:21:31.4143907Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4143948Z getattr(self, test_name)() 2025-12-04T13:21:31.4144106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4144139Z fn() 2025-12-04T13:21:31.4144289Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4144327Z method(*args, **kwargs) 2025-12-04T13:21:31.4144488Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4144526Z method(*args, **kwargs) 2025-12-04T13:21:31.4144679Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4144717Z with policy(): 2025-12-04T13:21:31.4144868Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4144908Z raise RuntimeError(msg) 2025-12-04T13:21:31.4145257Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4145259Z 2025-12-04T13:21:31.4145332Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4145558Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4145560Z 2025-12-04T13:21:31.4145667Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4145681Z 2025-12-04T13:21:31.4145739Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4145784Z Traceback (most recent call last): 2025-12-04T13:21:31.4145945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4145988Z getattr(self, test_name)() 2025-12-04T13:21:31.4146146Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4146180Z fn() 2025-12-04T13:21:31.4146329Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4146369Z method(*args, **kwargs) 2025-12-04T13:21:31.4146519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4146560Z method(*args, **kwargs) 2025-12-04T13:21:31.4146709Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4146746Z with policy(): 2025-12-04T13:21:31.4146897Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4146937Z raise RuntimeError(msg) 2025-12-04T13:21:31.4147286Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 
2025-12-04T13:21:31.4147288Z 2025-12-04T13:21:31.4147361Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4147588Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4147592Z 2025-12-04T13:21:31.4147679Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4147681Z 2025-12-04T13:21:31.4147739Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4147784Z Traceback (most recent call last): 2025-12-04T13:21:31.4147946Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4147987Z getattr(self, test_name)() 2025-12-04T13:21:31.4148179Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4148227Z fn() 2025-12-04T13:21:31.4148378Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4148419Z method(*args, **kwargs) 2025-12-04T13:21:31.4148569Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4148609Z method(*args, **kwargs) 2025-12-04T13:21:31.4148758Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4148795Z with policy(): 2025-12-04T13:21:31.4148945Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4148986Z raise RuntimeError(msg) 2025-12-04T13:21:31.4149337Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 2025-12-04T13:21:31.4149339Z 2025-12-04T13:21:31.4149441Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4149679Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4149681Z 2025-12-04T13:21:31.4149767Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4149830Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4149895Z ====================== 1 failed, 18 deselected in 23.16s ======================= 2025-12-04T13:21:31.4149933Z Got exit code 1 2025-12-04T13:21:31.4149973Z Retrying single test... 
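Note on the failure above: the run sets PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1, so the harness snapshots caching-allocator and driver-reported memory around each test and raises the RuntimeError shown when both grow; the printed repro command reruns only this test from the repo root. The sketch below merely illustrates that before/after comparison using public torch.cuda APIs; it is not the harness code in torch/testing/_internal/common_utils.py, and check_leak / run_suspect_test are placeholder names.

import torch

def check_leak(run_suspect_test, device=0):
    # Illustrative only: compare the two quantities quoted in the error message,
    # caching-allocator bytes and driver-allocated bytes, before and after the test.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)
    free, total = torch.cuda.mem_get_info(device)
    driver_before = total - free

    run_suspect_test()

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()  # drop cached blocks before re-measuring
    alloc_after = torch.cuda.memory_allocated(device)
    free, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free
    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: allocator {alloc_before} -> {alloc_after}, "
            f"driver {driver_before} -> {driver_after}"
        )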
2025-12-04T13:21:31.4150162Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0036e35144f9d74b.xml 2025-12-04T13:21:31.4150221Z ============================= test session starts ============================== 2025-12-04T13:21:31.4150336Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4150377Z cachedir: .pytest_cache 2025-12-04T13:21:31.4150538Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4150583Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4150625Z configfile: pytest.ini 2025-12-04T13:21:31.4150788Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4150863Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4151084Z stepcurrent: skipping 7 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4151128Z Running 1 items in this shard 2025-12-04T13:21:31.4151130Z 2025-12-04T13:21:31.4151435Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda I1204 13:10:13.383000 542854 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 542923 2025-12-04T13:21:31.4151592Z I1204 13:10:13.384000 542854 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 542924 2025-12-04T13:21:31.4151745Z I1204 13:10:13.385000 542854 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 542925 2025-12-04T13:21:31.4151895Z I1204 13:10:13.386000 542854 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 542926 2025-12-04T13:21:31.4152493Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4152532Z _warn_cpu_init() 2025-12-04T13:21:31.4152830Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4152866Z _init_core_state( 2025-12-04T13:21:31.4153363Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4153435Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4154023Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4154079Z _warn_cpu_init() 2025-12-04T13:21:31.4154372Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4154410Z _init_core_state( 2025-12-04T13:21:31.4154900Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4154962Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4155533Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4155569Z _warn_cpu_init() 2025-12-04T13:21:31.4155862Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4155900Z _init_core_state( 2025-12-04T13:21:31.4156393Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4156452Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4157033Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4157072Z _warn_cpu_init() 2025-12-04T13:21:31.4157559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4157617Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4158110Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4158218Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4158512Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4158548Z _init_core_state( 2025-12-04T13:21:31.4159040Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4159098Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4159583Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4159641Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4160922Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
2025-12-04T13:21:31.4161048Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4162335Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4162459Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4163737Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 2025-12-04T13:21:31.4163885Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4165154Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/autograd/graph.py:865: UserWarning: The AccumulateGrad node's stream does not match the stream of the node that produced the incoming gradient. This may incur unnecessary synchronization and break CUDA graph capture if the AccumulateGrad node's stream is the default stream. This mismatch is caused by an AccumulateGrad node created prior to the current iteration being kept alive. This can happen if the autograd graph is still being kept alive by tensors such as the loss, or if you are using DDP, which will stash a reference to the node. To resolve the mismatch, delete all references to the autograd graph or ensure that DDP initialization is performed under the same stream as subsequent forwards. If the mismatch is intentional, you can use torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False) to suppress this warning. (Triggered internally at /var/lib/jenkins/workspace/torch/csrc/autograd/input_buffer.cpp:240.) 
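The AccumulateGrad stream-mismatch UserWarnings repeated above are informational; the warning text itself names the switch for silencing them when the mismatch is intentional, shown here as a one-line sketch.

import torch

# Per the warning text above: suppress the AccumulateGrad stream-mismatch warning
# when the mismatch is known to be intentional.
torch.autograd.graph.set_warn_on_accumulate_grad_stream_mismatch(False)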
2025-12-04T13:21:31.4165276Z return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass 2025-12-04T13:21:31.4165505Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4165548Z return func(*args, **kwargs) 2025-12-04T13:21:31.4165770Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4165812Z return func(*args, **kwargs) 2025-12-04T13:21:31.4166034Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4166084Z return func(*args, **kwargs) 2025-12-04T13:21:31.4166307Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4166350Z return func(*args, **kwargs) 2025-12-04T13:21:31.4166570Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4166611Z return func(*args, **kwargs) 2025-12-04T13:21:31.4166830Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4166870Z return func(*args, **kwargs) 2025-12-04T13:21:31.4167090Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4167130Z return func(*args, **kwargs) 2025-12-04T13:21:31.4167361Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4167420Z return func(*args, **kwargs) 2025-12-04T13:21:31.4167712Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
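The FSDP `device_id` warnings and the barrier() warning above share one cause: the process never pins an explicit CUDA device, so FSDP receives a bare "cuda" device and barrier() falls back to the current context. Below is a hedged sketch of the fix the warnings themselves recommend; setup_fsdp, model, and rank are placeholders, and it assumes the usual env:// rendezvous variables (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE) are already set.

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def setup_fsdp(model, rank):
    # Pin the device before any FSDP or collective call, as the warnings advise.
    torch.cuda.set_device(rank)
    device = torch.device("cuda", rank)
    # Passing device_id here also addresses the barrier() warning about
    # "using the device under current context".
    dist.init_process_group("nccl", device_id=device)
    # An indexed device (cuda:<rank>) instead of bare "cuda" avoids the
    # "does not have an explicit index" warning.
    return FSDP(model, device_id=device)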
2025-12-04T13:21:31.4167751Z return func(*args, **kwargs) 2025-12-04T13:21:31.4167896Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4168059Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4168383Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4168540Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4168829Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4168954Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4169233Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4169384Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4169661Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4169810Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4170085Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4170223Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4170514Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4170667Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4171149Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4171264Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4171461Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4171829Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4171975Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4172187Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4172353Z [rank0]:E1204 13:10:22.072000 542923 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4172392Z dist init r=0, world=4 2025-12-04T13:21:31.4172532Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4172692Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4172980Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4173135Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4173421Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4173547Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4173822Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4173972Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4174247Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4174393Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4174680Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4174818Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4175096Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4175244Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4175723Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 2. CUDA driver allocated memory was 2300575744 and is now 17483956224. 2025-12-04T13:21:31.4175848Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4176061Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4176417Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4176530Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4176743Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4176907Z [rank2]:E1204 13:10:22.083000 542925 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4176947Z dist init r=2, world=4 2025-12-04T13:21:31.4177087Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4177246Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4177532Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4177685Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4177974Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4178099Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4178412Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4178559Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4178849Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4178997Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4179273Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4179410Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4179686Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4179835Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4180324Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 1. CUDA driver allocated memory was 2317352960 and is now 17500733440. 2025-12-04T13:21:31.4180463Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4180659Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4181013Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4181127Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4181340Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4181505Z [rank1]:E1204 13:10:22.109000 542924 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4181543Z dist init r=1, world=4 2025-12-04T13:21:31.4181681Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4181839Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4182126Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4182283Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4182570Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4182695Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4182970Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4183129Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4183405Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4183554Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4183829Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4183965Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4184244Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4184411Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4184898Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 3. CUDA driver allocated memory was 2250244096 and is now 17433624576. 
2025-12-04T13:21:31.4185014Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4185210Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4185565Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4185679Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4185889Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4186052Z [rank3]:E1204 13:10:22.133000 542926 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4186090Z dist init r=3, world=4 2025-12-04T13:21:31.4186427Z [rank0]:[W1204 13:10:22.920921713 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4186758Z [rank2]:[W1204 13:10:22.968009349 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4187087Z [rank1]:[W1204 13:10:22.053992346 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4187425Z [rank3]:[W1204 13:10:22.149778383 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
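On the ProcessGroupNCCL warnings just above: every rank exits without tearing down its process group, which the warning flags as a potential resource leak. A minimal teardown sketch follows, with main as a placeholder rather than the harness code.

import torch.distributed as dist

def main():
    dist.init_process_group("nccl")  # assumes env:// rendezvous variables are set
    try:
        ...  # test or training body
    finally:
        dist.destroy_process_group()  # explicit teardown avoids the warning above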
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4187467Z FAILED [22.8258s] [100%] 2025-12-04T13:21:31.4187469Z 2025-12-04T13:21:31.4187528Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4187631Z ____ TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda _____ 2025-12-04T13:21:31.4187676Z Traceback (most recent call last): 2025-12-04T13:21:31.4187842Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4187885Z self._join_processes(fn) 2025-12-04T13:21:31.4188059Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4188113Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4188324Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4188368Z raise RuntimeError(error) 2025-12-04T13:21:31.4188466Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4188524Z Traceback (most recent call last): 2025-12-04T13:21:31.4188699Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4188741Z getattr(self, test_name)() 2025-12-04T13:21:31.4188899Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4188935Z fn() 2025-12-04T13:21:31.4189086Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4189127Z method(*args, **kwargs) 2025-12-04T13:21:31.4189278Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4189320Z method(*args, **kwargs) 2025-12-04T13:21:31.4189470Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4189509Z with policy(): 2025-12-04T13:21:31.4189663Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4189704Z raise RuntimeError(msg) 2025-12-04T13:21:31.4190055Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 
2025-12-04T13:21:31.4190058Z 2025-12-04T13:21:31.4190134Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4190363Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4190366Z 2025-12-04T13:21:31.4190455Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4190458Z 2025-12-04T13:21:31.4190460Z 2025-12-04T13:21:31.4190535Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4190622Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4190855Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0036e35144f9d74b.xml - 2025-12-04T13:21:31.4190916Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4191175Z FAILED [22.8258s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4191221Z Traceback (most recent call last): 2025-12-04T13:21:31.4191387Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4191430Z getattr(self, test_name)() 2025-12-04T13:21:31.4191590Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4191624Z fn() 2025-12-04T13:21:31.4191776Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4191816Z method(*args, **kwargs) 2025-12-04T13:21:31.4191970Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4192010Z method(*args, **kwargs) 2025-12-04T13:21:31.4192160Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4192198Z with policy(): 2025-12-04T13:21:31.4192358Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4192422Z raise RuntimeError(msg) 2025-12-04T13:21:31.4192772Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 117248 on device 0. CUDA driver allocated memory was 2453667840 and is now 17637048320. 2025-12-04T13:21:31.4192774Z 2025-12-04T13:21:31.4192849Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4193077Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4193079Z 2025-12-04T13:21:31.4193168Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4193232Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.4193295Z ====================== 1 failed, 18 deselected in 22.96s ======================= 2025-12-04T13:21:31.4193333Z Got exit code 1 2025-12-04T13:21:31.4193511Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda 2025-12-04T13:21:31.4193639Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4193828Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-81529582b53aae4e.xml 2025-12-04T13:21:31.4193886Z ============================= test session starts ============================== 2025-12-04T13:21:31.4193999Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4194041Z cachedir: .pytest_cache 2025-12-04T13:21:31.4194202Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4194249Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4194290Z configfile: pytest.ini 2025-12-04T13:21:31.4194452Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4194526Z collecting ... collected 60 items / 8 deselected / 52 selected 2025-12-04T13:21:31.4194579Z stepcurrent: skipping 8 already run items. 2025-12-04T13:21:31.4194621Z Running 11 items in this shard 2025-12-04T13:21:31.4194623Z 2025-12-04T13:21:31.4194942Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda I1204 13:10:38.900000 544120 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 544189 2025-12-04T13:21:31.4195097Z I1204 13:10:38.901000 544120 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 544190 2025-12-04T13:21:31.4195252Z I1204 13:10:38.902000 544120 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 544191 2025-12-04T13:21:31.4195402Z I1204 13:10:38.902000 544120 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 544192 2025-12-04T13:21:31.4195984Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4196022Z _warn_cpu_init() 2025-12-04T13:21:31.4196530Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4196612Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4197181Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4197220Z _warn_cpu_init() 2025-12-04T13:21:31.4197786Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4197824Z _warn_cpu_init() 2025-12-04T13:21:31.4198355Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4198416Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4198907Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4198968Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4199554Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4199593Z _warn_cpu_init() 2025-12-04T13:21:31.4199885Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4199972Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4200465Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4200522Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4200826Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4200921Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4201431Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4201487Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4201780Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4201860Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4202149Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4202227Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4202512Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4202592Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4203085Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4203145Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4203632Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4203688Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4203994Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4204069Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4204356Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4204437Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4204724Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4204798Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4205089Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4205142Z return func(*args, **kwargs) 2025-12-04T13:21:31.4205379Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4205433Z return func(*args, **kwargs) 2025-12-04T13:21:31.4205655Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4205697Z return func(*args, **kwargs) 2025-12-04T13:21:31.4205917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4205959Z return func(*args, **kwargs) 2025-12-04T13:21:31.4206184Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4206227Z return func(*args, **kwargs) 2025-12-04T13:21:31.4206448Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4206488Z return func(*args, **kwargs) 2025-12-04T13:21:31.4206706Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4206747Z return func(*args, **kwargs) 2025-12-04T13:21:31.4206967Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4207009Z return func(*args, **kwargs) 2025-12-04T13:21:31.4207229Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
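The FutureWarnings above are raised because the test wraps modules with ShardingStrategy.NO_SHARD, which this build deprecates in favor of DistributedDataParallel. Below is a hedged sketch of the suggested replacement; wrap_unsharded, model, and rank are placeholders, not the helpers in common_fsdp.py.

import torch
from torch.nn.parallel import DistributedDataParallel as DDP

def wrap_unsharded(model, rank):
    # Replicate the model with DDP instead of FSDP(..., sharding_strategy=NO_SHARD),
    # as the deprecation warning suggests.
    model = model.to(torch.device("cuda", rank))
    return DDP(model, device_ids=[rank])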
2025-12-04T13:21:31.4207270Z return func(*args, **kwargs) 2025-12-04T13:21:31.4207416Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4207580Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4207873Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4208040Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4208390Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4208519Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4208798Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4208948Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4209225Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4209388Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4209675Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4209828Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4210105Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4210256Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4210742Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 
2025-12-04T13:21:31.4210861Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4211059Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4211422Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4211540Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4211755Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4211920Z [rank1]:E1204 13:10:47.674000 544190 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4211959Z dist init r=1, world=4 2025-12-04T13:21:31.4212098Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4212257Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4212557Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4212713Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4213000Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4213125Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4213403Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4213552Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4213838Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4214004Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4214280Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4214416Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4214697Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4214846Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4215329Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:21:31.4215446Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4215644Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4216005Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4216121Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4216334Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4216499Z [rank3]:E1204 13:10:47.676000 544192 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4216537Z dist init r=3, world=4 2025-12-04T13:21:31.4216687Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4216848Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4217136Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4217291Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4217577Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4217702Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4217992Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4218219Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4218495Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4218644Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4218920Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4219058Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4219337Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4219487Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4219967Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:21:31.4220082Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4220280Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4220640Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4220755Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4220979Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4221145Z [rank0]:E1204 13:10:47.683000 544189 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4221184Z dist init r=0, world=4 2025-12-04T13:21:31.4221322Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4221483Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4221769Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4221924Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4222221Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4222360Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4222653Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4222802Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4223078Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4223225Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4223501Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4223638Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4223915Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4224063Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4224543Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:21:31.4224660Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4224857Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4225218Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4225342Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4225555Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4225719Z [rank2]:E1204 13:10:47.684000 544191 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4225758Z dist init r=2, world=4 2025-12-04T13:21:31.4226096Z [rank1]:[W1204 13:10:47.508836314 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4226427Z [rank3]:[W1204 13:10:47.528499745 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4226768Z [rank0]:[W1204 13:10:47.552294980 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4227115Z [rank2]:[W1204 13:10:47.576013066 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4227156Z FAILED [23.0242s] [ 9%] 2025-12-04T13:21:31.4227158Z 2025-12-04T13:21:31.4227215Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4227318Z ___ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda ___ 2025-12-04T13:21:31.4227364Z Traceback (most recent call last): 2025-12-04T13:21:31.4227530Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4227576Z self._join_processes(fn) 2025-12-04T13:21:31.4227750Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4227804Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4227982Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4228026Z raise RuntimeError(error) 2025-12-04T13:21:31.4228106Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4228206Z Traceback (most recent call last): 2025-12-04T13:21:31.4228368Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4228411Z getattr(self, test_name)() 2025-12-04T13:21:31.4228569Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4228605Z fn() 2025-12-04T13:21:31.4228757Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4228799Z method(*args, **kwargs) 2025-12-04T13:21:31.4228950Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4228990Z method(*args, **kwargs) 2025-12-04T13:21:31.4229139Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4229177Z with policy(): 2025-12-04T13:21:31.4229341Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4229383Z raise RuntimeError(msg) 2025-12-04T13:21:31.4229741Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 
2025-12-04T13:21:31.4229746Z 2025-12-04T13:21:31.4229821Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4230056Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4230059Z 2025-12-04T13:21:31.4230146Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4230148Z 2025-12-04T13:21:31.4230150Z 2025-12-04T13:21:31.4230226Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4230314Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4230573Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-81529582b53aae4e.xml - 2025-12-04T13:21:31.4230646Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4230895Z FAILED [23.0242s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4230941Z Traceback (most recent call last): 2025-12-04T13:21:31.4231106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4231148Z getattr(self, test_name)() 2025-12-04T13:21:31.4231309Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4231344Z fn() 2025-12-04T13:21:31.4231497Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4231539Z method(*args, **kwargs) 2025-12-04T13:21:31.4231690Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4231730Z method(*args, **kwargs) 2025-12-04T13:21:31.4231882Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4231919Z with policy(): 2025-12-04T13:21:31.4232071Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4232112Z raise RuntimeError(msg) 2025-12-04T13:21:31.4232468Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:21:31.4232472Z 2025-12-04T13:21:31.4232548Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4232781Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4232784Z 2025-12-04T13:21:31.4232872Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4232934Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
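For reference, the repeated UserWarning captured above recommends pinning each rank to its GPU with `torch.cuda.set_device()` or passing an explicit device index as `device_id`, and the FutureWarning points from the deprecated `NO_SHARD` strategy toward `DistributedDataParallel`. The following is only a minimal sketch of that recommendation, not part of the failing test: the file name, the toy `torch.nn.Linear(16, 16)` model, and the assumption of a `torchrun --nproc-per-node=4` launch (which sets `LOCAL_RANK`) are illustrative.

import os

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


def main() -> None:
    # torchrun sets LOCAL_RANK; pin each process to its own GPU up front,
    # which avoids the "does not have an explicit index" warning above.
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Passing device_id here also silences the barrier() warning about
    # "using the device under current context".
    dist.init_process_group("nccl", device_id=torch.device("cuda", local_rank))

    model = torch.nn.Linear(16, 16)
    # Explicit device index instead of the bare "cuda" string flagged above.
    fsdp_model = FSDP(model, device_id=torch.device("cuda", local_rank))
    fsdp_model(torch.randn(8, 16, device="cuda")).sum().backward()

    # Explicit teardown avoids the ProcessGroupNCCL shutdown warning below.
    dist.destroy_process_group()


if __name__ == "__main__":
    main()

Launched with something like `torchrun --nproc-per-node=4 fsdp_device_example.py` (a hypothetical file name), this runs one process per GPU with an unambiguous device on every rank.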
2025-12-04T13:21:31.4232997Z ======================= 1 failed, 8 deselected in 23.16s ======================= 2025-12-04T13:21:31.4233035Z Got exit code 1 2025-12-04T13:21:31.4233084Z Retrying single test... 2025-12-04T13:21:31.4233276Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b083c281fab1e433.xml 2025-12-04T13:21:31.4233335Z ============================= test session starts ============================== 2025-12-04T13:21:31.4233449Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4233490Z cachedir: .pytest_cache 2025-12-04T13:21:31.4233650Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4233695Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4233735Z configfile: pytest.ini 2025-12-04T13:21:31.4233898Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4233974Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4234213Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4234275Z Running 1 items in this shard 2025-12-04T13:21:31.4234287Z 2025-12-04T13:21:31.4234596Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda I1204 13:11:04.310000 545530 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 545599 2025-12-04T13:21:31.4234753Z I1204 13:11:04.311000 545530 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 545600 2025-12-04T13:21:31.4234905Z I1204 13:11:04.312000 545530 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 545601 2025-12-04T13:21:31.4235057Z I1204 13:11:04.313000 545530 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 545602 2025-12-04T13:21:31.4235640Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4235679Z _warn_cpu_init() 2025-12-04T13:21:31.4236176Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4236238Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4236816Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4236855Z _warn_cpu_init() 2025-12-04T13:21:31.4237360Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4237421Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4237995Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4238034Z _warn_cpu_init() 2025-12-04T13:21:31.4238562Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4238620Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4239203Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4239263Z _warn_cpu_init() 2025-12-04T13:21:31.4239561Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4239647Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4240143Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4240202Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4240488Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4240571Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4240858Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4240940Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4241432Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4241490Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4241779Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4241871Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4242159Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4242236Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4242524Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4242597Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4243089Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4243168Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4243468Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4243512Z return func(*args, **kwargs) 2025-12-04T13:21:31.4243796Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4243876Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4244371Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4244431Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4244718Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4244791Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4245019Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4245062Z return func(*args, **kwargs) 2025-12-04T13:21:31.4245287Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4245329Z return func(*args, **kwargs) 2025-12-04T13:21:31.4245552Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4245592Z return func(*args, **kwargs) 2025-12-04T13:21:31.4245813Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4245852Z return func(*args, **kwargs) 2025-12-04T13:21:31.4246073Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4246121Z return func(*args, **kwargs) 2025-12-04T13:21:31.4246342Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4246384Z return func(*args, **kwargs) 2025-12-04T13:21:31.4246603Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4246643Z return func(*args, **kwargs) 2025-12-04T13:21:31.4246861Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
2025-12-04T13:21:31.4246903Z return func(*args, **kwargs) 2025-12-04T13:21:31.4247047Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4247211Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4247522Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4247688Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4247973Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4248099Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4248418Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4248568Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4248846Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4248993Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4249271Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4249408Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4249688Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4249837Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4250319Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:21:31.4250451Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4250648Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4251011Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4251126Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4251340Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4251507Z [rank0]:E1204 13:11:13.306000 545599 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4251545Z dist init r=0, world=4 2025-12-04T13:21:31.4251715Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4251886Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4252173Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4252327Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4252613Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4252737Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4253014Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4253161Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4253436Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4253585Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4253862Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4254001Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4254277Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4254425Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4254917Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:21:31.4255034Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4255230Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4255588Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4255704Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4255927Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4256112Z [rank3]:E1204 13:11:13.307000 545602 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4256151Z dist init r=3, world=4 2025-12-04T13:21:31.4256288Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4256446Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4256733Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4256888Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4257172Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4257297Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4257573Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4257720Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4257998Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4258194Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4258471Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4258606Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4258896Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4259045Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4259523Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 2025-12-04T13:21:31.4259638Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4259833Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4260205Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4260343Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4260554Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4260719Z [rank1]:E1204 13:11:13.317000 545600 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4260757Z dist init r=1, world=4 2025-12-04T13:21:31.4260896Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4261055Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4261344Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4261499Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4261785Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4261909Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4262186Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4262337Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4262613Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4262761Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4263037Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4263185Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4263462Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4263613Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4264092Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:21:31.4264206Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4264412Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4264789Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4264905Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4265114Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4265281Z [rank2]:E1204 13:11:13.357000 545601 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4265319Z dist init r=2, world=4 2025-12-04T13:21:31.4265657Z [rank3]:[W1204 13:11:13.152041512 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4265988Z [rank0]:[W1204 13:11:13.160596354 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4266314Z [rank1]:[W1204 13:11:13.177252135 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4266643Z [rank2]:[W1204 13:11:13.315506001 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4266684Z FAILED [23.2271s] [100%] 2025-12-04T13:21:31.4266687Z 2025-12-04T13:21:31.4266746Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4266849Z ___ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda ___ 2025-12-04T13:21:31.4266895Z Traceback (most recent call last): 2025-12-04T13:21:31.4267059Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4267103Z self._join_processes(fn) 2025-12-04T13:21:31.4267293Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4267348Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4267531Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4267575Z raise RuntimeError(error) 2025-12-04T13:21:31.4267657Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4267701Z Traceback (most recent call last): 2025-12-04T13:21:31.4267864Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4267905Z getattr(self, test_name)() 2025-12-04T13:21:31.4268063Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4268097Z fn() 2025-12-04T13:21:31.4268284Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4268325Z method(*args, **kwargs) 2025-12-04T13:21:31.4268492Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4268544Z method(*args, **kwargs) 2025-12-04T13:21:31.4268707Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4268745Z with policy(): 2025-12-04T13:21:31.4268897Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4268938Z raise RuntimeError(msg) 2025-12-04T13:21:31.4269293Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:21:31.4269297Z 2025-12-04T13:21:31.4269374Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4269609Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4269613Z 2025-12-04T13:21:31.4269702Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4269704Z 2025-12-04T13:21:31.4269705Z 2025-12-04T13:21:31.4269779Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4269869Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4270105Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-b083c281fab1e433.xml - 2025-12-04T13:21:31.4270167Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4270417Z FAILED [23.2271s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4270465Z Traceback (most recent call last): 2025-12-04T13:21:31.4270631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4270673Z getattr(self, test_name)() 2025-12-04T13:21:31.4270833Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4270868Z fn() 2025-12-04T13:21:31.4271021Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4271061Z method(*args, **kwargs) 2025-12-04T13:21:31.4271226Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4271266Z method(*args, **kwargs) 2025-12-04T13:21:31.4271417Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4271455Z with policy(): 2025-12-04T13:21:31.4271607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4271646Z raise RuntimeError(msg) 2025-12-04T13:21:31.4272000Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:21:31.4272002Z 2025-12-04T13:21:31.4272076Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4272313Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4272342Z 2025-12-04T13:21:31.4272431Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4272506Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
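The RuntimeError above is raised by the leak-check policy that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 (visible in the repro command) wraps around the test body: GPU memory is sampled before and after the test, and the run fails when the numbers have grown, with the message noting that the driver API "confirmed" the allocator-level discrepancy. A minimal, illustrative sketch of that kind of before/after comparison in plain PyTorch, assuming a hypothetical helper `run_with_leak_check`; this is not the actual torch.testing._internal policy implementation:

import torch

def run_with_leak_check(fn, device=0):
    # Snapshot memory before the test body: caching-allocator bytes plus a driver-level view.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)
    free_before, total = torch.cuda.mem_get_info(device)
    driver_before = total - free_before

    fn()  # run the test body

    # Snapshot again after the test body and compare.
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible CUDA memory leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after} bytes, driver {driver_before} -> {driver_after} bytes"
        )

The figures in the failure (512 -> 166400 allocator bytes, roughly 2.4 GB -> 17.6 GB driver-allocated) are what such a comparison surfaces; the retry that follows reruns only this test to establish whether the same comparison fails again.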
2025-12-04T13:21:31.4272571Z ====================== 1 failed, 18 deselected in 23.37s ======================= 2025-12-04T13:21:31.4272608Z Got exit code 1 2025-12-04T13:21:31.4272649Z Retrying single test... 2025-12-04T13:21:31.4272838Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9ad311a424db7abe.xml 2025-12-04T13:21:31.4272896Z ============================= test session starts ============================== 2025-12-04T13:21:31.4273008Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4273050Z cachedir: .pytest_cache 2025-12-04T13:21:31.4273208Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4273256Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4273298Z configfile: pytest.ini 2025-12-04T13:21:31.4273462Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4273537Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4273762Z stepcurrent: skipping 8 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4273806Z Running 1 items in this shard 2025-12-04T13:21:31.4273808Z 2025-12-04T13:21:31.4274117Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda I1204 13:11:30.385000 546940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 547009 2025-12-04T13:21:31.4274272Z I1204 13:11:30.386000 546940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 547010 2025-12-04T13:21:31.4274427Z I1204 13:11:30.386000 546940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 547011 2025-12-04T13:21:31.4274579Z I1204 13:11:30.387000 546940 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 547012 2025-12-04T13:21:31.4275174Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4275212Z _warn_cpu_init() 2025-12-04T13:21:31.4275708Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4275772Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4276345Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4276382Z _warn_cpu_init() 2025-12-04T13:21:31.4276895Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4276965Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4277540Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4277580Z _warn_cpu_init() 2025-12-04T13:21:31.4278068Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4278129Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4278752Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4278791Z _warn_cpu_init() 2025-12-04T13:21:31.4279083Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4279166Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4279455Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4279536Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4280044Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4280104Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4280392Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4280472Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4281056Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4281128Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4281437Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4281516Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4281800Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4281876Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4282164Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4282239Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4282731Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4282789Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4283081Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4283124Z return func(*args, **kwargs) 2025-12-04T13:21:31.4283412Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4283493Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4283982Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4284041Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4284341Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4284418Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4284646Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4284689Z return func(*args, **kwargs) 2025-12-04T13:21:31.4284912Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4284954Z return func(*args, **kwargs) 2025-12-04T13:21:31.4285176Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4285218Z return func(*args, **kwargs) 2025-12-04T13:21:31.4285448Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4285507Z return func(*args, **kwargs) 2025-12-04T13:21:31.4285728Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4285767Z return func(*args, **kwargs) 2025-12-04T13:21:31.4285986Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4286026Z return func(*args, **kwargs) 2025-12-04T13:21:31.4286246Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4286286Z return func(*args, **kwargs) 2025-12-04T13:21:31.4286508Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:21:31.4286548Z return func(*args, **kwargs) 2025-12-04T13:21:31.4286696Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4286859Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4287150Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4287305Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4287593Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4287721Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4287999Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4288193Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4288489Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4288639Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4288914Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4289053Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4289332Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4289482Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4289978Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:21:31.4290118Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4290315Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4290676Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4290792Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4291006Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4291170Z [rank2]:E1204 13:11:39.122000 547011 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4291219Z dist init r=2, world=4 2025-12-04T13:21:31.4291397Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4291560Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4291848Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4292004Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4292289Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4292415Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4292703Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4292852Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4293130Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4293276Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4293552Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4293690Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4293980Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4294152Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4294630Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 2025-12-04T13:21:31.4294745Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4294942Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4295303Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4295419Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4295629Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4295795Z [rank0]:E1204 13:11:39.125000 547009 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4295834Z dist init r=0, world=4 2025-12-04T13:21:31.4295976Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4296137Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4296426Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4296579Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4296877Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4297001Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4297279Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4297429Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4297704Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4297853Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4298138Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4298329Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4298623Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4298773Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4299252Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:21:31.4299369Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4299565Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4299923Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4300037Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4300249Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4300413Z [rank3]:E1204 13:11:39.127000 547012 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4300454Z dist init r=3, world=4 2025-12-04T13:21:31.4300591Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4300752Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4301039Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4301204Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.4301489Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4301616Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4301892Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4302040Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4302317Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4302477Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4302775Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4302910Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4303189Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4303337Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4303814Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17483956224. 
2025-12-04T13:21:31.4303931Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4304125Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4304486Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4304599Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4304812Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4304977Z [rank1]:E1204 13:11:39.187000 547010 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4305014Z dist init r=1, world=4 2025-12-04T13:21:31.4305355Z [rank2]:[W1204 13:11:39.969171858 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4305699Z [rank0]:[W1204 13:11:39.982246177 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4306027Z [rank3]:[W1204 13:11:39.990209488 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4306352Z [rank1]:[W1204 13:11:39.166457355 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4306393Z FAILED [23.0269s] [100%] 2025-12-04T13:21:31.4306395Z 2025-12-04T13:21:31.4306453Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4306554Z ___ TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda ___ 2025-12-04T13:21:31.4306611Z Traceback (most recent call last): 2025-12-04T13:21:31.4306785Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4306839Z self._join_processes(fn) 2025-12-04T13:21:31.4307013Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4307067Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4307245Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4307289Z raise RuntimeError(error) 2025-12-04T13:21:31.4307369Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4307415Z Traceback (most recent call last): 2025-12-04T13:21:31.4307578Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4307623Z getattr(self, test_name)() 2025-12-04T13:21:31.4307782Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4307819Z fn() 2025-12-04T13:21:31.4307969Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4308011Z method(*args, **kwargs) 2025-12-04T13:21:31.4308191Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4308233Z method(*args, **kwargs) 2025-12-04T13:21:31.4308383Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4308420Z with policy(): 2025-12-04T13:21:31.4308573Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4308615Z raise RuntimeError(msg) 2025-12-04T13:21:31.4308970Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:21:31.4308973Z 2025-12-04T13:21:31.4309048Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4309281Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4309283Z 2025-12-04T13:21:31.4309387Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4309389Z 2025-12-04T13:21:31.4309450Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4309497Z Traceback (most recent call last): 2025-12-04T13:21:31.4309662Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4309704Z getattr(self, test_name)() 2025-12-04T13:21:31.4309863Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4309897Z fn() 2025-12-04T13:21:31.4310049Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4310088Z method(*args, **kwargs) 2025-12-04T13:21:31.4310238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4310277Z method(*args, **kwargs) 2025-12-04T13:21:31.4310428Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4310493Z with policy(): 2025-12-04T13:21:31.4310644Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4310699Z raise RuntimeError(msg) 2025-12-04T13:21:31.4311050Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:21:31.4311052Z 2025-12-04T13:21:31.4311126Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4311358Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4311360Z 2025-12-04T13:21:31.4311448Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4311452Z 2025-12-04T13:21:31.4311509Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4311557Z Traceback (most recent call last): 2025-12-04T13:21:31.4311719Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4311762Z getattr(self, test_name)() 2025-12-04T13:21:31.4311920Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4311954Z fn() 2025-12-04T13:21:31.4312106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4312144Z method(*args, **kwargs) 2025-12-04T13:21:31.4312296Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4312335Z method(*args, **kwargs) 2025-12-04T13:21:31.4312486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4312524Z with policy(): 2025-12-04T13:21:31.4312676Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4312716Z raise RuntimeError(msg) 2025-12-04T13:21:31.4313066Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:21:31.4313068Z 2025-12-04T13:21:31.4313141Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4313382Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4313385Z 2025-12-04T13:21:31.4313472Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4313475Z 2025-12-04T13:21:31.4313477Z 2025-12-04T13:21:31.4313553Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4313641Z Process 0 terminated with exit code 10, terminating remaining processes. 
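Two of the warnings repeated throughout this run point at cleanup and device selection rather than at the leak itself: ProcessGroupNCCL warns that destroy_process_group() was not called before exit, and the c10d barrier() warns that no device was tied to the process group. A minimal sketch of the shutdown pattern those warnings ask for, assuming a torchrun-style launcher that sets LOCAL_RANK and the rendezvous environment variables (the function name `main` and the elided body are placeholders):

import os

import torch
import torch.distributed as dist

def main():
    local_rank = int(os.environ["LOCAL_RANK"])
    device = torch.device("cuda", local_rank)
    torch.cuda.set_device(device)
    # Binding the group to a device also silences the
    # "barrier(): using the device under current context" warning seen above.
    dist.init_process_group(backend="nccl", device_id=device)
    try:
        ...  # test or training body
        dist.barrier()
    finally:
        # Explicit teardown; skipping this is what triggers the
        # ProcessGroupNCCL.cpp:1553 warning in the log.
        dist.destroy_process_group()

if __name__ == "__main__":
    main()

Here the worker processes are spawned by common_distributed.py rather than by user code, so the warning comes from the harness's own workers; the pattern above is the general one the warning's linked documentation points to.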
2025-12-04T13:21:31.4313876Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9ad311a424db7abe.xml - 2025-12-04T13:21:31.4313937Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4314187Z FAILED [23.0269s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4314249Z Traceback (most recent call last): 2025-12-04T13:21:31.4314422Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4314475Z getattr(self, test_name)() 2025-12-04T13:21:31.4314636Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4314672Z fn() 2025-12-04T13:21:31.4314824Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4314865Z method(*args, **kwargs) 2025-12-04T13:21:31.4315014Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4315055Z method(*args, **kwargs) 2025-12-04T13:21:31.4315204Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4315244Z with policy(): 2025-12-04T13:21:31.4315395Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4315437Z raise RuntimeError(msg) 2025-12-04T13:21:31.4315790Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17620271104. 
2025-12-04T13:21:31.4315793Z 2025-12-04T13:21:31.4315865Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4316097Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4316099Z 2025-12-04T13:21:31.4316185Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4316188Z 2025-12-04T13:21:31.4316246Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4316292Z Traceback (most recent call last): 2025-12-04T13:21:31.4318535Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4318581Z getattr(self, test_name)() 2025-12-04T13:21:31.4318752Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4318786Z fn() 2025-12-04T13:21:31.4318939Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4318979Z method(*args, **kwargs) 2025-12-04T13:21:31.4319159Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4319199Z method(*args, **kwargs) 2025-12-04T13:21:31.4319352Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4319389Z with policy(): 2025-12-04T13:21:31.4319542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4319583Z raise RuntimeError(msg) 2025-12-04T13:21:31.4319937Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17467179008. 
2025-12-04T13:21:31.4319939Z 2025-12-04T13:21:31.4320013Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4320257Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4320273Z 2025-12-04T13:21:31.4320376Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4320378Z 2025-12-04T13:21:31.4320436Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4320480Z Traceback (most recent call last): 2025-12-04T13:21:31.4320643Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4320685Z getattr(self, test_name)() 2025-12-04T13:21:31.4320844Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4320878Z fn() 2025-12-04T13:21:31.4321028Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4321068Z method(*args, **kwargs) 2025-12-04T13:21:31.4321220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4321260Z method(*args, **kwargs) 2025-12-04T13:21:31.4321410Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4321446Z with policy(): 2025-12-04T13:21:31.4321597Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4321637Z raise RuntimeError(msg) 2025-12-04T13:21:31.4321988Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17416847360. 2025-12-04T13:21:31.4321991Z 2025-12-04T13:21:31.4322062Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4322296Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4322299Z 2025-12-04T13:21:31.4322384Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4322449Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
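Besides the leak itself, the most frequent UserWarning in both runs is FSDP reporting that `device_id` was passed as a bare "cuda" with no index, alongside the CPU-init warning recommending `device_id` so that sharding initialization happens on the GPU. A short sketch of the explicit-device form those warnings suggest; `wrap_model` is a hypothetical helper, and the model and process-group setup are assumed to exist elsewhere:

import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(model: torch.nn.Module, local_rank: int) -> FSDP:
    # Make the current device explicit, as the warning suggests ...
    torch.cuda.set_device(local_rank)
    # ... and hand FSDP an indexed device rather than the bare "cuda" string.
    device = torch.device("cuda", local_rank)
    # An explicit device_id also moves sharding initialization onto the GPU,
    # addressing the "passed-in `module` is on CPU" warning above.
    return FSDP(model, device_id=device)

The FutureWarnings about the `NO_SHARD` sharding strategy, by contrast, come from the test fixtures in common_fsdp.py and already carry their own recommendation (DistributedDataParallel for that configuration).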
2025-12-04T13:21:31.4322513Z ====================== 1 failed, 18 deselected in 23.16s ======================= 2025-12-04T13:21:31.4322551Z Got exit code 1 2025-12-04T13:21:31.4322732Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda 2025-12-04T13:21:31.4322873Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4323066Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-edddab3c46c1b17a.xml 2025-12-04T13:21:31.4323125Z ============================= test session starts ============================== 2025-12-04T13:21:31.4323239Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4323280Z cachedir: .pytest_cache 2025-12-04T13:21:31.4323439Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4323485Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4323526Z configfile: pytest.ini 2025-12-04T13:21:31.4323690Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4323766Z collecting ... collected 60 items / 9 deselected / 51 selected 2025-12-04T13:21:31.4323818Z stepcurrent: skipping 9 already run items. 2025-12-04T13:21:31.4323861Z Running 10 items in this shard 2025-12-04T13:21:31.4323884Z 2025-12-04T13:21:31.4324221Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda I1204 13:11:55.877000 548350 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 548419 2025-12-04T13:21:31.4324388Z I1204 13:11:55.878000 548350 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 548420 2025-12-04T13:21:31.4324539Z I1204 13:11:55.879000 548350 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 548421 2025-12-04T13:21:31.4324689Z I1204 13:11:55.879000 548350 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 548422 2025-12-04T13:21:31.4325274Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4325313Z _warn_cpu_init() 2025-12-04T13:21:31.4325807Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4325869Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4326445Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4326485Z _warn_cpu_init() 2025-12-04T13:21:31.4326974Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4327046Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4327615Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4327653Z _warn_cpu_init() 2025-12-04T13:21:31.4328142Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4328238Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4328824Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4328885Z _warn_cpu_init() 2025-12-04T13:21:31.4329177Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4329261Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4329752Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4329811Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4330096Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4330179Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4330466Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4330545Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4330832Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4330910Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4331403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4331461Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4331771Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4331814Z return func(*args, **kwargs) 2025-12-04T13:21:31.4332101Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4332180Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4332669Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4332727Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4333022Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4333116Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4333401Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4333480Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4333970Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4334029Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4334316Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4334390Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4334619Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4334661Z return func(*args, **kwargs) 2025-12-04T13:21:31.4334886Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4334926Z return func(*args, **kwargs) 2025-12-04T13:21:31.4335148Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4335188Z return func(*args, **kwargs) 2025-12-04T13:21:31.4335408Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4335448Z return func(*args, **kwargs) 2025-12-04T13:21:31.4335670Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4335709Z return func(*args, **kwargs) 2025-12-04T13:21:31.4335940Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4335983Z return func(*args, **kwargs) 2025-12-04T13:21:31.4336201Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4336242Z return func(*args, **kwargs) 2025-12-04T13:21:31.4336459Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:21:31.4336500Z return func(*args, **kwargs) 2025-12-04T13:21:31.4336645Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4336810Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4337111Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4337287Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4337572Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4337699Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4337979Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4338128Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4338443Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4338590Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4338865Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4339003Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4339280Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4339430Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4339942Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T13:21:31.4340072Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4340270Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4340664Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4340779Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4340992Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4341157Z [rank1]:E1204 13:12:28.437000 548420 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4341195Z dist init r=1, world=4 2025-12-04T13:21:31.4341349Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4341531Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4341818Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4341971Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4342256Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4342381Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4342660Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4342809Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4343084Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4343232Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4343508Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4343646Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4343922Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4344070Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4344588Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T13:21:31.4344704Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4344901Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4345289Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4345404Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4345628Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4345817Z [rank3]:E1204 13:12:28.438000 548422 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4345856Z dist init r=3, world=4 2025-12-04T13:21:31.4345993Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4346153Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4346440Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4346595Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4346879Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4347004Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4347283Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4347429Z [rank2]:E1204 13:12:28.439000 548421 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4347707Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4347855Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4348130Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4348297Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4348587Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4348736Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4349243Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T13:21:31.4349359Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4349554Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4349956Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4350093Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4350304Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4350468Z [rank2]:E1204 13:12:28.439000 548421 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4350506Z dist init r=2, world=4 2025-12-04T13:21:31.4350643Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4350802Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4351090Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:21:31.4351246Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4351529Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4351652Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4351931Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4352079Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4352354Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4352500Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4352774Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4352920Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4353198Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4353348Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4353853Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T13:21:31.4353968Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4354178Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4354588Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4354703Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4354913Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4355077Z [rank0]:E1204 13:12:28.456000 548419 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4355116Z dist init r=0, world=4 2025-12-04T13:21:31.4355455Z [rank2]:[W1204 13:12:28.290645818 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4355786Z [rank1]:[W1204 13:12:28.306756168 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4356112Z [rank3]:[W1204 13:12:28.309660682 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4356440Z [rank0]:[W1204 13:12:28.377501880 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4356482Z FAILED [46.8445s] [ 10%] 2025-12-04T13:21:31.4356485Z 2025-12-04T13:21:31.4356542Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4356671Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda _ 2025-12-04T13:21:31.4356717Z Traceback (most recent call last): 2025-12-04T13:21:31.4356880Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4356922Z self._join_processes(fn) 2025-12-04T13:21:31.4357105Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4357160Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4357340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4357384Z raise RuntimeError(error) 2025-12-04T13:21:31.4357465Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4357509Z Traceback (most recent call last): 2025-12-04T13:21:31.4357670Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4357711Z getattr(self, test_name)() 2025-12-04T13:21:31.4357870Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4357904Z fn() 2025-12-04T13:21:31.4358057Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4358107Z method(*args, **kwargs) 2025-12-04T13:21:31.4358311Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4358364Z method(*args, **kwargs) 2025-12-04T13:21:31.4358514Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4358551Z with policy(): 2025-12-04T13:21:31.4358704Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4358743Z raise RuntimeError(msg) 2025-12-04T13:21:31.4359131Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T13:21:31.4359133Z 2025-12-04T13:21:31.4359211Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4359474Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4359477Z 2025-12-04T13:21:31.4359565Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4359568Z 2025-12-04T13:21:31.4359626Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4359672Z Traceback (most recent call last): 2025-12-04T13:21:31.4359834Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4359876Z getattr(self, test_name)() 2025-12-04T13:21:31.4360035Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4360070Z fn() 2025-12-04T13:21:31.4360221Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4360263Z method(*args, **kwargs) 2025-12-04T13:21:31.4360411Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4360451Z method(*args, **kwargs) 2025-12-04T13:21:31.4360601Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4360637Z with policy(): 2025-12-04T13:21:31.4360789Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4360829Z raise RuntimeError(msg) 2025-12-04T13:21:31.4361228Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T13:21:31.4361232Z 2025-12-04T13:21:31.4361305Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4361566Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4361568Z 2025-12-04T13:21:31.4361654Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4361656Z 2025-12-04T13:21:31.4361659Z 2025-12-04T13:21:31.4361735Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4361822Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.4362070Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-edddab3c46c1b17a.xml - 2025-12-04T13:21:31.4362160Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4362437Z FAILED [46.8445s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4362483Z Traceback (most recent call last): 2025-12-04T13:21:31.4362645Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4362688Z getattr(self, test_name)() 2025-12-04T13:21:31.4362848Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4362882Z fn() 2025-12-04T13:21:31.4363033Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4363076Z method(*args, **kwargs) 2025-12-04T13:21:31.4363226Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4363266Z method(*args, **kwargs) 2025-12-04T13:21:31.4363415Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4363452Z with policy(): 2025-12-04T13:21:31.4363603Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4363644Z raise RuntimeError(msg) 2025-12-04T13:21:31.4364027Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T13:21:31.4364032Z 2025-12-04T13:21:31.4364104Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4364365Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4364367Z 2025-12-04T13:21:31.4364453Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4364454Z 2025-12-04T13:21:31.4364513Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4364557Z Traceback (most recent call last): 2025-12-04T13:21:31.4364730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4364771Z getattr(self, test_name)() 2025-12-04T13:21:31.4364932Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4364966Z fn() 2025-12-04T13:21:31.4365116Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4365155Z method(*args, **kwargs) 2025-12-04T13:21:31.4365305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4365343Z method(*args, **kwargs) 2025-12-04T13:21:31.4365492Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4365529Z with policy(): 2025-12-04T13:21:31.4365681Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4365721Z raise RuntimeError(msg) 2025-12-04T13:21:31.4366114Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T13:21:31.4366137Z 2025-12-04T13:21:31.4366211Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4366468Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4366470Z 2025-12-04T13:21:31.4366556Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4366620Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4366683Z ======================= 1 failed, 9 deselected in 46.99s ======================= 2025-12-04T13:21:31.4366720Z Got exit code 1 2025-12-04T13:21:31.4366761Z Retrying single test... 
2025-12-04T13:21:31.4366951Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-59232a91c35e498e.xml 2025-12-04T13:21:31.4367010Z ============================= test session starts ============================== 2025-12-04T13:21:31.4367122Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4367162Z cachedir: .pytest_cache 2025-12-04T13:21:31.4367321Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4367367Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4367408Z configfile: pytest.ini 2025-12-04T13:21:31.4367571Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4367646Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4367903Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4367948Z Running 1 items in this shard 2025-12-04T13:21:31.4367950Z 2025-12-04T13:21:31.4368315Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda I1204 13:12:45.125000 549760 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 549829 2025-12-04T13:21:31.4368471Z I1204 13:12:45.126000 549760 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 549830 2025-12-04T13:21:31.4368636Z I1204 13:12:45.126000 549760 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 549831 2025-12-04T13:21:31.4368788Z I1204 13:12:45.127000 549760 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 549832 2025-12-04T13:21:31.4369370Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4369406Z _warn_cpu_init() 2025-12-04T13:21:31.4369916Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4369994Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4370584Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4370621Z _warn_cpu_init() 2025-12-04T13:21:31.4371112Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4371175Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4371744Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4371781Z _warn_cpu_init() 2025-12-04T13:21:31.4372271Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4372331Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4372902Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4372937Z _warn_cpu_init() 2025-12-04T13:21:31.4373242Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4373328Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4373616Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4373695Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4373978Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4374060Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4374559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4374640Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4374926Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4375006Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4375500Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4375558Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4375848Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4375889Z return func(*args, **kwargs) 2025-12-04T13:21:31.4376376Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4376433Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4376721Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4376800Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4377084Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4377160Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4377452Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4377532Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4378023Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4378083Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4378410Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4378484Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4378733Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4378787Z return func(*args, **kwargs) 2025-12-04T13:21:31.4379025Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4379065Z return func(*args, **kwargs) 2025-12-04T13:21:31.4379287Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4379327Z return func(*args, **kwargs) 2025-12-04T13:21:31.4379548Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4379589Z return func(*args, **kwargs) 2025-12-04T13:21:31.4379809Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4379850Z return func(*args, **kwargs) 2025-12-04T13:21:31.4380071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4380109Z return func(*args, **kwargs) 2025-12-04T13:21:31.4380328Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 2025-12-04T13:21:31.4380367Z return func(*args, **kwargs) 2025-12-04T13:21:31.4380586Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict willbe returned. 
2025-12-04T13:21:31.4380626Z return func(*args, **kwargs) 2025-12-04T13:21:31.4380772Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4380936Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4381226Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4381383Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4381680Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4381807Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4382085Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4382234Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4382512Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4382660Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4382944Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4383100Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4383378Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4383526Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4384038Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T13:21:31.4384156Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4384351Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4384744Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4384860Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4385072Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4385237Z [rank3]:E1204 13:13:17.945000 549832 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4385276Z dist init r=3, world=4 2025-12-04T13:21:31.4385415Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4385574Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4385870Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4386023Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4386310Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4386434Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4386711Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4386858Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4387143Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4387300Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4387583Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4387719Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4387996Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4388184Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4388694Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4388809Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4389004Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4389395Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4389512Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4389723Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4389887Z [rank0]:E1204 13:13:17.961000 549829 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4389925Z dist init r=0, world=4 2025-12-04T13:21:31.4390062Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4390237Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4390524Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4390679Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4390962Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4391086Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4391362Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4391538Z [rank1]:E1204 13:13:17.966000 549830 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4391831Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4391977Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4392252Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4392387Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4392664Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4392812Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4393318Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4393433Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4393628Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4394022Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4394134Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4394344Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4394522Z [rank1]:E1204 13:13:17.966000 549830 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4394561Z dist init r=1, world=4 2025-12-04T13:21:31.4394700Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4394859Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4395145Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:21:31.4395297Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4395581Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4395713Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4396012Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4396159Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4396436Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4396584Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4396859Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4396996Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4397272Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4397419Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4397928Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
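The RuntimeError above reports two numbers per rank: the caching-allocator bytes (512 before the test, 166400 after) and the driver-level allocation (for example 2300575744 growing to 17469276160 on device 2). For orientation only, here is a minimal sketch of that kind of before/after comparison using public torch.cuda APIs; it is not the harness's actual leak checker (that lives in torch/testing/_internal/common_utils.py, per the traceback) and it assumes a visible CUDA/ROCm device.

import torch

def snapshot(device: int) -> tuple[int, int]:
    # (caching-allocator bytes for live tensors, device-wide bytes in use per the driver)
    torch.cuda.synchronize(device)
    free, total = torch.cuda.mem_get_info(device)
    return torch.cuda.memory_allocated(device), total - free

def check_for_leak(fn, device: int = 0) -> None:
    alloc_before, driver_before = snapshot(device)
    fn()
    torch.cuda.empty_cache()  # release cached blocks so the driver figure reflects live memory
    alloc_after, driver_after = snapshot(device)
    if alloc_after > alloc_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator went from "
            f"{alloc_before} to {alloc_after} bytes "
            f"(driver-level usage: {driver_before} -> {driver_after})"
        )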
2025-12-04T13:21:31.4398042Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4398288Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4398679Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4398809Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4399019Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4399184Z [rank2]:E1204 13:13:18.005000 549831 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4399222Z dist init r=2, world=4 2025-12-04T13:21:31.4399557Z [rank3]:[W1204 13:13:18.835600518 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4399888Z [rank0]:[W1204 13:13:18.908379949 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4400236Z [rank1]:[W1204 13:13:18.913587296 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4400586Z [rank2]:[W1204 13:13:18.980650819 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4400626Z FAILED [47.0483s] [100%] 2025-12-04T13:21:31.4400628Z 2025-12-04T13:21:31.4400686Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4400815Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda _ 2025-12-04T13:21:31.4400861Z Traceback (most recent call last): 2025-12-04T13:21:31.4401026Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4401069Z self._join_processes(fn) 2025-12-04T13:21:31.4401243Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4401296Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4401474Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4401518Z raise RuntimeError(error) 2025-12-04T13:21:31.4401599Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4401644Z Traceback (most recent call last): 2025-12-04T13:21:31.4401806Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4401847Z getattr(self, test_name)() 2025-12-04T13:21:31.4402006Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4402040Z fn() 2025-12-04T13:21:31.4402192Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4402231Z method(*args, **kwargs) 2025-12-04T13:21:31.4402383Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4402421Z method(*args, **kwargs) 2025-12-04T13:21:31.4402571Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4402606Z with policy(): 2025-12-04T13:21:31.4402769Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4402809Z raise RuntimeError(msg) 2025-12-04T13:21:31.4403195Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T13:21:31.4403199Z 2025-12-04T13:21:31.4403275Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4403537Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4403541Z 2025-12-04T13:21:31.4403629Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4403631Z 2025-12-04T13:21:31.4403633Z 2025-12-04T13:21:31.4403709Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4403807Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4404048Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-59232a91c35e498e.xml - 2025-12-04T13:21:31.4404129Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4404406Z FAILED [47.0483s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4404451Z Traceback (most recent call last): 2025-12-04T13:21:31.4404616Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4404658Z getattr(self, test_name)() 2025-12-04T13:21:31.4404818Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4404853Z fn() 2025-12-04T13:21:31.4405006Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4405046Z method(*args, **kwargs) 2025-12-04T13:21:31.4405200Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4405238Z method(*args, **kwargs) 2025-12-04T13:21:31.4405388Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4405424Z with policy(): 2025-12-04T13:21:31.4405575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4405616Z raise RuntimeError(msg) 2025-12-04T13:21:31.4406000Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T13:21:31.4406004Z 2025-12-04T13:21:31.4406078Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4406340Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4406342Z 2025-12-04T13:21:31.4406429Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4406491Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
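The repro command printed above can be run as-is from a pytorch checkout. A small wrapper that sets the same environment variables as this CI job is sketched below; the commented-out line mirrors the log's hint about silencing the repro banner, and check=True is only there so a local run fails loudly like the CI step does.

import os
import subprocess

env = dict(os.environ)
env["PYTORCH_TEST_WITH_ROCM"] = "1"            # flags copied from the repro line above
env["PYTORCH_TEST_CUDA_MEM_LEAK_CHECK"] = "1"
# env["PYTORCH_PRINT_REPRO_ON_FAILURE"] = "0"  # optional, per the suppression hint in the log

subprocess.run(
    [
        "python",
        "test/distributed/fsdp/test_fsdp_core.py",
        "TestParityWithDDPCUDA."
        "test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda",
    ],
    env=env,
    check=True,  # raise if the test process exits non-zero, as it does here
)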
2025-12-04T13:21:31.4406567Z ====================== 1 failed, 18 deselected in 47.19s ======================= 2025-12-04T13:21:31.4406604Z Got exit code 1 2025-12-04T13:21:31.4406644Z Retrying single test... 2025-12-04T13:21:31.4406834Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d1df5502214bf3b9.xml 2025-12-04T13:21:31.4406893Z ============================= test session starts ============================== 2025-12-04T13:21:31.4407004Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4407045Z cachedir: .pytest_cache 2025-12-04T13:21:31.4407204Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4407250Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4407289Z configfile: pytest.ini 2025-12-04T13:21:31.4407453Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4407528Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4407791Z stepcurrent: skipping 9 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4407866Z Running 1 items in this shard 2025-12-04T13:21:31.4407868Z 2025-12-04T13:21:31.4408238Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda I1204 13:13:34.730000 551170 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 551239 2025-12-04T13:21:31.4408392Z I1204 13:13:34.731000 551170 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 551240 2025-12-04T13:21:31.4408543Z I1204 13:13:34.731000 551170 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 551241 2025-12-04T13:21:31.4408693Z I1204 13:13:34.732000 551170 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 551242 2025-12-04T13:21:31.4409273Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4409311Z _warn_cpu_init() 2025-12-04T13:21:31.4409808Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4409871Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4410440Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4410477Z _warn_cpu_init() 2025-12-04T13:21:31.4410981Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4411042Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4411610Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4411647Z _warn_cpu_init() 2025-12-04T13:21:31.4412153Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4412224Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4412805Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4412842Z _warn_cpu_init() 2025-12-04T13:21:31.4413134Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4413218Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4413505Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4413586Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4414076Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4414134Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4414424Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4414504Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4414996Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4415053Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4415350Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4415429Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4415713Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4415790Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4416074Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4416146Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4416437Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4416491Z return func(*args, **kwargs) 2025-12-04T13:21:31.4416994Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4417063Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4417349Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:787: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. 
If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4417429Z shared = FSDP(shared, group, **fsdp_kwargs) # type: ignore[assignment] 2025-12-04T13:21:31.4417916Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4417975Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4418291Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4418364Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4418592Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4418635Z return func(*args, **kwargs) 2025-12-04T13:21:31.4418859Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4418902Z return func(*args, **kwargs) 2025-12-04T13:21:31.4419121Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4419162Z return func(*args, **kwargs) 2025-12-04T13:21:31.4419386Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4419426Z return func(*args, **kwargs) 2025-12-04T13:21:31.4419661Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4419703Z return func(*args, **kwargs) 2025-12-04T13:21:31.4419922Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4419962Z return func(*args, **kwargs) 2025-12-04T13:21:31.4420181Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4420219Z return func(*args, **kwargs) 2025-12-04T13:21:31.4420437Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned.
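The UserWarnings repeated above (module initialized on CPU, `device_id` passed as plain "cuda" without an index, barrier() inferring the device) all point at the same remedy: pin each rank to an explicit device before constructing FSDP and pass an indexed device_id so FSDP moves the module itself. A minimal sketch, assuming one GPU per rank and an already-initialized process group; the module argument is a placeholder.

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(module: torch.nn.Module) -> FSDP:
    local_device = dist.get_rank() % torch.cuda.device_count()
    torch.cuda.set_device(local_device)  # avoids the "FSDP will use the current device" guess
    return FSDP(
        module,
        device_id=torch.device("cuda", local_device),  # explicit index; FSDP moves the CPU module
    )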
2025-12-04T13:21:31.4420476Z return func(*args, **kwargs) 2025-12-04T13:21:31.4420621Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4420797Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4421110Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4421265Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4421550Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4421677Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4421955Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4422105Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4422381Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4422529Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4422804Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4422942Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4423220Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4423368Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4423892Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T13:21:31.4424010Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4424207Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4424598Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4424712Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4424925Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4425099Z [rank0]:E1204 13:14:07.404000 551239 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4425165Z dist init r=0, world=4 2025-12-04T13:21:31.4425302Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4425461Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4425746Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4425902Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4426188Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4426315Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4426593Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4426740Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4427017Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4427165Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4427441Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4427578Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4427854Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4428014Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4428560Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4428679Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4428875Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4429265Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4429407Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4429633Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4429797Z [rank1]:E1204 13:14:07.409000 551240 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4429835Z dist init r=1, world=4 2025-12-04T13:21:31.4429972Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4430131Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4430419Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4430573Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4430857Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4430983Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4431260Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4431408Z [rank3]:E1204 13:14:07.414000 551242 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4431682Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4431829Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4432103Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4432253Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4432535Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4432684Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4433195Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T13:21:31.4433309Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4433514Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4433914Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4434037Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4436618Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4436783Z [rank3]:E1204 13:14:07.414000 551242 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4436823Z dist init r=3, world=4 2025-12-04T13:21:31.4436961Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4437121Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4437408Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 
2025-12-04T13:21:31.4437562Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4437874Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4438000Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4438308Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4438458Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4438735Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4438882Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4439178Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4439315Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4439591Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4439738Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4440261Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
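The "Started process N with pid ...", "dist init r=N, world=4", and "exiting process N with exit code: 10" lines come from a spawn-based multi-process harness: the parent launches one child per rank, and a non-zero child exit is re-raised as the RuntimeError shown in the FAILURES section. The general pattern looks roughly like the sketch below; the per-rank body and rendezvous address are placeholders, not the harness's actual code.

import torch.distributed as dist
import torch.multiprocessing as mp

WORLD_SIZE = 4

def _run_rank(rank: int) -> None:
    dist.init_process_group(
        "nccl", init_method="tcp://127.0.0.1:29500",
        rank=rank, world_size=WORLD_SIZE,
    )
    print(f"dist init r={rank}, world={WORLD_SIZE}")
    try:
        pass  # per-rank test body goes here
    finally:
        dist.destroy_process_group()

if __name__ == "__main__":
    # join=True makes the parent raise if any child exits with a non-zero code
    mp.spawn(_run_rank, nprocs=WORLD_SIZE, join=True)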
2025-12-04T13:21:31.4440378Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4440589Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4440976Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4441160Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4441372Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4441536Z [rank2]:E1204 13:14:07.467000 551241 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4441575Z dist init r=2, world=4 2025-12-04T13:21:31.4441911Z [rank0]:[W1204 13:14:07.240668545 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4442241Z [rank1]:[W1204 13:14:07.249826778 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4442568Z [rank3]:[W1204 13:14:07.273658546 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4442897Z [rank2]:[W1204 13:14:07.404744246 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4442939Z FAILED [46.8455s] [100%] 2025-12-04T13:21:31.4442941Z 2025-12-04T13:21:31.4443001Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4443129Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda _ 2025-12-04T13:21:31.4443175Z Traceback (most recent call last): 2025-12-04T13:21:31.4443356Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4443401Z self._join_processes(fn) 2025-12-04T13:21:31.4443575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4443630Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4443807Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4443852Z raise RuntimeError(error) 2025-12-04T13:21:31.4443931Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4443977Z Traceback (most recent call last): 2025-12-04T13:21:31.4444137Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4444181Z getattr(self, test_name)() 2025-12-04T13:21:31.4444338Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4444385Z fn() 2025-12-04T13:21:31.4444536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4444588Z method(*args, **kwargs) 2025-12-04T13:21:31.4444738Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4444778Z method(*args, **kwargs) 2025-12-04T13:21:31.4444928Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4444982Z with policy(): 2025-12-04T13:21:31.4445134Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4445176Z raise RuntimeError(msg) 2025-12-04T13:21:31.4445562Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T13:21:31.4445565Z 2025-12-04T13:21:31.4445642Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4445906Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4445909Z 2025-12-04T13:21:31.4445998Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4446000Z 2025-12-04T13:21:31.4446002Z 2025-12-04T13:21:31.4446079Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4446167Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4446402Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d1df5502214bf3b9.xml - 2025-12-04T13:21:31.4446465Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4446744Z FAILED [46.8455s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4446791Z Traceback (most recent call last): 2025-12-04T13:21:31.4446955Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4446998Z getattr(self, test_name)() 2025-12-04T13:21:31.4447166Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4447201Z fn() 2025-12-04T13:21:31.4447355Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4447396Z method(*args, **kwargs) 2025-12-04T13:21:31.4447547Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4447587Z method(*args, **kwargs) 2025-12-04T13:21:31.4447736Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4447775Z with policy(): 2025-12-04T13:21:31.4447925Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4447966Z raise RuntimeError(msg) 2025-12-04T13:21:31.4448398Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4448415Z 2025-12-04T13:21:31.4448491Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4448752Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4448754Z 2025-12-04T13:21:31.4448841Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4448919Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
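Both runs also end with per-rank ProcessGroupNCCL warnings that destroy_process_group() was not called before exit, plus the earlier barrier() warning about no device being bound at init. The shutdown pattern those messages point to is roughly the following; it assumes a torchrun-style launch that sets LOCAL_RANK, and device_id in init_process_group is only available in recent torch releases.

import os
import torch
import torch.distributed as dist

def main() -> None:
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)
    # Binding the group to an explicit device silences the barrier() device warning.
    dist.init_process_group("nccl", device_id=torch.device("cuda", local_rank))
    try:
        dist.barrier()
        # ... test / training body ...
    finally:
        dist.destroy_process_group()  # explicit teardown, as the NCCL warning recommends

if __name__ == "__main__":
    main()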
2025-12-04T13:21:31.4448983Z ====================== 1 failed, 18 deselected in 46.98s ======================= 2025-12-04T13:21:31.4449021Z Got exit code 1 2025-12-04T13:21:31.4449232Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda 2025-12-04T13:21:31.4449362Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4449551Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0a49c8ca17fcd339.xml 2025-12-04T13:21:31.4449611Z ============================= test session starts ============================== 2025-12-04T13:21:31.4449722Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4449765Z cachedir: .pytest_cache 2025-12-04T13:21:31.4449923Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4449970Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4450011Z configfile: pytest.ini 2025-12-04T13:21:31.4450175Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4450249Z collecting ... collected 60 items / 10 deselected / 50 selected 2025-12-04T13:21:31.4450304Z stepcurrent: skipping 10 already run items. 2025-12-04T13:21:31.4450348Z Running 9 items in this shard 2025-12-04T13:21:31.4450352Z 2025-12-04T13:21:31.4450682Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda I1204 13:14:24.135000 552580 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 552649 2025-12-04T13:21:31.4450839Z I1204 13:14:24.136000 552580 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 552650 2025-12-04T13:21:31.4451002Z I1204 13:14:24.137000 552580 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 552651 2025-12-04T13:21:31.4451155Z I1204 13:14:24.137000 552580 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 552652 2025-12-04T13:21:31.4451738Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4451779Z _warn_cpu_init() 2025-12-04T13:21:31.4452075Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4452113Z _init_core_state( 2025-12-04T13:21:31.4452618Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4452692Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4453265Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4453312Z _warn_cpu_init() 2025-12-04T13:21:31.4453607Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4453646Z _init_core_state( 2025-12-04T13:21:31.4454139Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4454202Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4454772Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4454810Z _warn_cpu_init() 2025-12-04T13:21:31.4455102Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4455140Z _init_core_state( 2025-12-04T13:21:31.4455640Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4455699Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4456268Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4456306Z _warn_cpu_init() 2025-12-04T13:21:31.4456796Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4456863Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4457356Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4457432Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4457720Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4457764Z return func(*args, **kwargs) 2025-12-04T13:21:31.4458056Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4458095Z _init_core_state( 2025-12-04T13:21:31.4458611Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4458670Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4459159Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4459217Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4459447Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4459488Z return func(*args, **kwargs) 2025-12-04T13:21:31.4459712Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4459753Z return func(*args, **kwargs) 2025-12-04T13:21:31.4459989Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:21:31.4460030Z return func(*args, **kwargs) 2025-12-04T13:21:31.4460252Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4460293Z return func(*args, **kwargs) 2025-12-04T13:21:31.4460513Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4460553Z return func(*args, **kwargs) 2025-12-04T13:21:31.4460773Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4460813Z return func(*args, **kwargs) 2025-12-04T13:21:31.4461032Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4461083Z return func(*args, **kwargs) 2025-12-04T13:21:31.4461305Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4461361Z return func(*args, **kwargs) 2025-12-04T13:21:31.4461506Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4461669Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4461976Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4462134Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4462421Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4462547Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4462826Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4462974Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4463253Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4463401Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4463681Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4463818Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4464106Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4464256Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4464763Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4464881Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4465078Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4465473Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4465599Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4465809Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4465975Z [rank0]:E1204 13:14:57.048000 552649 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4466026Z dist init r=0, world=4 2025-12-04T13:21:31.4466165Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4466326Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4466616Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4466771Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4467054Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4467180Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4467457Z [rank3]:E1204 13:14:57.051000 552652 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4467606Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4467881Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4468029Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4468344Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4468492Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4468777Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4468926Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4469429Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T13:21:31.4469544Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4469751Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4470148Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4470262Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4470487Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4470653Z [rank3]:E1204 13:14:57.051000 552652 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4470691Z dist init r=3, world=4 2025-12-04T13:21:31.4470830Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4470993Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4471283Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4471438Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4471723Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4471847Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4472124Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4472272Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4472547Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4472706Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4472982Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4473120Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4473398Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4473548Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4474061Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T13:21:31.4474194Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4474388Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4474771Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4474897Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4475108Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4475273Z [rank2]:E1204 13:14:57.071000 552651 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4475311Z dist init r=2, world=4 2025-12-04T13:21:31.4475449Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4475607Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4475896Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4476049Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4476335Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4476461Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4476736Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4476896Z [rank1]:E1204 13:14:57.112000 552650 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4477172Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4477321Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4477596Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4477733Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4478013Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4478219Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4478721Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4478859Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4479054Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4479437Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4479551Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4479761Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4479924Z [rank1]:E1204 13:14:57.112000 552650 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4479964Z dist init r=1, world=4 2025-12-04T13:21:31.4480300Z [rank3]:[W1204 13:14:57.892706910 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4480632Z [rank0]:[W1204 13:14:57.893304251 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4480959Z [rank2]:[W1204 13:14:57.935290170 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4481299Z [rank1]:[W1204 13:14:57.059136509 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4481341Z FAILED [47.1405s] [ 11%] 2025-12-04T13:21:31.4481343Z 2025-12-04T13:21:31.4481401Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4481525Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda _ 2025-12-04T13:21:31.4481572Z Traceback (most recent call last): 2025-12-04T13:21:31.4481736Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4481779Z self._join_processes(fn) 2025-12-04T13:21:31.4481952Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4482007Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4482185Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4482229Z raise RuntimeError(error) 2025-12-04T13:21:31.4482310Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4482365Z Traceback (most recent call last): 2025-12-04T13:21:31.4482527Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4482579Z getattr(self, test_name)() 2025-12-04T13:21:31.4482738Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4482772Z fn() 2025-12-04T13:21:31.4482923Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4482977Z method(*args, **kwargs) 2025-12-04T13:21:31.4483128Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4483169Z method(*args, **kwargs) 2025-12-04T13:21:31.4483319Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4483357Z with policy(): 2025-12-04T13:21:31.4483509Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4483550Z raise RuntimeError(msg) 2025-12-04T13:21:31.4483930Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 
2025-12-04T13:21:31.4483933Z 2025-12-04T13:21:31.4484008Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4484264Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4484267Z 2025-12-04T13:21:31.4484358Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4484361Z 2025-12-04T13:21:31.4484363Z 2025-12-04T13:21:31.4484439Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4484527Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4484763Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-0a49c8ca17fcd339.xml - 2025-12-04T13:21:31.4484824Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4485108Z FAILED [47.1405s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4485154Z Traceback (most recent call last): 2025-12-04T13:21:31.4485319Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4485361Z getattr(self, test_name)() 2025-12-04T13:21:31.4485521Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4485556Z fn() 2025-12-04T13:21:31.4485707Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4485748Z method(*args, **kwargs) 2025-12-04T13:21:31.4485900Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4485939Z method(*args, **kwargs) 2025-12-04T13:21:31.4486090Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4486126Z with policy(): 2025-12-04T13:21:31.4486288Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4486338Z raise RuntimeError(msg) 2025-12-04T13:21:31.4486716Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4486730Z 2025-12-04T13:21:31.4486805Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4487062Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4487064Z 2025-12-04T13:21:31.4487152Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4487215Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
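The repeated _init_utils.py warnings in the sessions above ask for an explicit device index instead of the bare `cuda` device. A minimal sketch of the pattern the warning describes, using placeholder names `model` and `rank` and assuming the process group is already initialized (this is not the test's own code):

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap_with_explicit_device(model, rank):
        # Make the current device explicit, as the warning suggests...
        torch.cuda.set_device(rank)
        # ...and pass an indexed device_id so FSDP does not have to infer it.
        return FSDP(model, device_id=torch.device(f"cuda:{rank}"))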
2025-12-04T13:21:31.4487282Z ====================== 1 failed, 10 deselected in 47.28s ======================= 2025-12-04T13:21:31.4487319Z Got exit code 1 2025-12-04T13:21:31.4487360Z Retrying single test... 2025-12-04T13:21:31.4487549Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-52a6d20b2aed2cc3.xml 2025-12-04T13:21:31.4487608Z ============================= test session starts ============================== 2025-12-04T13:21:31.4487721Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4487762Z cachedir: .pytest_cache 2025-12-04T13:21:31.4487920Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4487966Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4488006Z configfile: pytest.ini 2025-12-04T13:21:31.4488215Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4488291Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4488540Z stepcurrent: skipping 10 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4488585Z Running 1 items in this shard 2025-12-04T13:21:31.4488588Z 2025-12-04T13:21:31.4488931Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda I1204 13:15:13.982000 553990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 554059 2025-12-04T13:21:31.4489087Z I1204 13:15:13.983000 553990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 554060 2025-12-04T13:21:31.4489241Z I1204 13:15:13.984000 553990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 554061 2025-12-04T13:21:31.4489393Z I1204 13:15:13.984000 553990 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 554062 2025-12-04T13:21:31.4489978Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4490017Z _warn_cpu_init() 2025-12-04T13:21:31.4490334Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4490383Z _init_core_state( 2025-12-04T13:21:31.4490876Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4490950Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4491522Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4491560Z _warn_cpu_init() 2025-12-04T13:21:31.4491853Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4491890Z _init_core_state( 2025-12-04T13:21:31.4492385Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4492447Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4493013Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4493051Z _warn_cpu_init() 2025-12-04T13:21:31.4493345Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4493381Z _init_core_state( 2025-12-04T13:21:31.4493881Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4493940Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4494510Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4494548Z _warn_cpu_init() 2025-12-04T13:21:31.4495047Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4495116Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4495600Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4495672Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4496159Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4496216Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4496511Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4496549Z _init_core_state( 2025-12-04T13:21:31.4497040Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4497099Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4497390Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4497432Z return func(*args, **kwargs) 2025-12-04T13:21:31.4497661Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4497704Z return func(*args, **kwargs) 2025-12-04T13:21:31.4497936Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4497978Z return func(*args, **kwargs) 2025-12-04T13:21:31.4498241Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
2025-12-04T13:21:31.4498283Z return func(*args, **kwargs) 2025-12-04T13:21:31.4498504Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4498544Z return func(*args, **kwargs) 2025-12-04T13:21:31.4498764Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4498805Z return func(*args, **kwargs) 2025-12-04T13:21:31.4499026Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4499079Z return func(*args, **kwargs) 2025-12-04T13:21:31.4499299Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4499354Z return func(*args, **kwargs) 2025-12-04T13:21:31.4499574Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4499630Z return func(*args, **kwargs) 2025-12-04T13:21:31.4499776Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4499939Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4500230Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4500385Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4500671Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4500797Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4501076Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4501226Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4501502Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4501651Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4501927Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4502078Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4502355Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4502506Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4503012Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4503130Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4503339Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4503722Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4503850Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4504071Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4504238Z [rank1]:E1204 13:15:46.978000 554060 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4504276Z dist init r=1, world=4 2025-12-04T13:21:31.4504417Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4504577Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4504866Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4505020Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4505306Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4505432Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4505708Z [rank3]:E1204 13:15:46.982000 554062 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4505858Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4506134Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4506281Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4506574Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4506711Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4506994Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4507144Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4507655Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
2025-12-04T13:21:31.4507781Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4507976Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4508384Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4508522Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4508735Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4508899Z [rank3]:E1204 13:15:46.982000 554062 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4508939Z dist init r=3, world=4 2025-12-04T13:21:31.4509076Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4509238Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4509527Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4509680Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4509965Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4510089Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4510364Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4510511Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4510801Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4510949Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4511224Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4511360Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4511639Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4511789Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4512304Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 2025-12-04T13:21:31.4512432Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4512639Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4513023Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4513136Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4513348Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4513512Z [rank2]:E1204 13:15:47.012000 554061 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4513550Z dist init r=2, world=4 2025-12-04T13:21:31.4513688Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4513850Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4514139Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4514294Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4514576Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4514702Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4514987Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4515136Z [rank0]:E1204 13:15:47.020000 554059 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4515412Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4515559Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4515834Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4515972Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4516262Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4516419Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4516920Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4517045Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4517241Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4517623Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4517735Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4517947Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4518110Z [rank0]:E1204 13:15:47.020000 554059 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4518186Z dist init r=0, world=4 2025-12-04T13:21:31.4518524Z [rank3]:[W1204 13:15:47.865831727 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4518856Z [rank1]:[W1204 13:15:47.870142828 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4519184Z [rank2]:[W1204 13:15:47.966213464 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4519525Z [rank0]:[W1204 13:15:47.967603962 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4519567Z FAILED [47.4467s] [100%] 2025-12-04T13:21:31.4519571Z 2025-12-04T13:21:31.4519627Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4519752Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda _ 2025-12-04T13:21:31.4519798Z Traceback (most recent call last): 2025-12-04T13:21:31.4519963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4520005Z self._join_processes(fn) 2025-12-04T13:21:31.4520180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4520234Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4520422Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4520479Z raise RuntimeError(error) 2025-12-04T13:21:31.4520559Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4520605Z Traceback (most recent call last): 2025-12-04T13:21:31.4520765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4520829Z getattr(self, test_name)() 2025-12-04T13:21:31.4520989Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4521024Z fn() 2025-12-04T13:21:31.4521177Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4521217Z method(*args, **kwargs) 2025-12-04T13:21:31.4521369Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4521409Z method(*args, **kwargs) 2025-12-04T13:21:31.4521560Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4521598Z with policy(): 2025-12-04T13:21:31.4521749Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4521791Z raise RuntimeError(msg) 2025-12-04T13:21:31.4522171Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 
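The ProcessGroupNCCL warnings above are about missing teardown: none of the four ranks calls destroy_process_group() before exiting. A minimal sketch of the recommended cleanup, assuming the usual env:// rendezvous these tests rely on; the worker body here is a placeholder, not the actual FSDP test:

import os
import torch
import torch.distributed as dist

def main() -> None:
    # Illustrative per-rank setup; the test harness normally does this itself.
    rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(rank)
    dist.init_process_group(backend="nccl", init_method="env://")
    try:
        dist.barrier()  # placeholder for the real per-rank test body
    finally:
        # Explicit teardown releases NCCL resources and avoids the
        # "destroy_process_group() was not called before program exit" warning.
        dist.destroy_process_group()

if __name__ == "__main__":
    main()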
2025-12-04T13:21:31.4522175Z 2025-12-04T13:21:31.4522249Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4522507Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4522511Z 2025-12-04T13:21:31.4522599Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4522601Z 2025-12-04T13:21:31.4522602Z 2025-12-04T13:21:31.4522679Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4522768Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4523004Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-52a6d20b2aed2cc3.xml - 2025-12-04T13:21:31.4523074Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4523347Z FAILED [47.4467s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4523395Z Traceback (most recent call last): 2025-12-04T13:21:31.4523561Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4523602Z getattr(self, test_name)() 2025-12-04T13:21:31.4523763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4523799Z fn() 2025-12-04T13:21:31.4523950Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4523991Z method(*args, **kwargs) 2025-12-04T13:21:31.4524141Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4524192Z method(*args, **kwargs) 2025-12-04T13:21:31.4524342Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4524390Z with policy(): 2025-12-04T13:21:31.4524542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4524582Z raise RuntimeError(msg) 2025-12-04T13:21:31.4524960Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T13:21:31.4524973Z 2025-12-04T13:21:31.4525050Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4525305Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4525308Z 2025-12-04T13:21:31.4525396Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4525459Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
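The repro command above enables the same leak check that failed here (PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1): the harness records the caching-allocator and driver memory counters before the test and re-checks them afterwards. A rough, hand-rolled version of that comparison for local debugging, assuming one visible GPU; run_suspect_workload is a stand-in for the failing test body, not a PyTorch API:

import torch

def run_suspect_workload() -> None:
    # Stand-in for the failing FSDP test body.
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x
    del x, y

def check_for_leak() -> None:
    torch.cuda.synchronize()
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated()    # caching-allocator bytes in use
    free_before, total = torch.cuda.mem_get_info()  # driver-side free/total bytes
    run_suspect_workload()
    torch.cuda.synchronize()
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated()
    free_after, _ = torch.cuda.mem_get_info()
    if alloc_after > alloc_before or free_after < free_before:
        raise RuntimeError(
            f"possible leak: allocator {alloc_before} -> {alloc_after} bytes, "
            f"driver-used {total - free_before} -> {total - free_after} bytes"
        )

if __name__ == "__main__":
    check_for_leak()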
2025-12-04T13:21:31.4525522Z ====================== 1 failed, 18 deselected in 47.61s ======================= 2025-12-04T13:21:31.4525559Z Got exit code 1 2025-12-04T13:21:31.4525600Z Retrying single test... 2025-12-04T13:21:31.4525791Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5aef99b217eb6286.xml 2025-12-04T13:21:31.4525850Z ============================= test session starts ============================== 2025-12-04T13:21:31.4525963Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4526005Z cachedir: .pytest_cache 2025-12-04T13:21:31.4526163Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4526209Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4526249Z configfile: pytest.ini 2025-12-04T13:21:31.4526412Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4526487Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4526738Z stepcurrent: skipping 10 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4526782Z Running 1 items in this shard 2025-12-04T13:21:31.4526796Z 2025-12-04T13:21:31.4527126Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda I1204 13:16:04.049000 555400 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 555469 2025-12-04T13:21:31.4527281Z I1204 13:16:04.050000 555400 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 555470 2025-12-04T13:21:31.4527433Z I1204 13:16:04.051000 555400 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 555471 2025-12-04T13:21:31.4527583Z I1204 13:16:04.052000 555400 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 555472 2025-12-04T13:21:31.4528226Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4528276Z _warn_cpu_init() 2025-12-04T13:21:31.4528574Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4528611Z _init_core_state( 2025-12-04T13:21:31.4529102Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.4529178Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4529749Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4529788Z _warn_cpu_init() 2025-12-04T13:21:31.4530081Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4530120Z _init_core_state( 2025-12-04T13:21:31.4530614Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4530676Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4531245Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4531282Z _warn_cpu_init() 2025-12-04T13:21:31.4531588Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4531625Z _init_core_state( 2025-12-04T13:21:31.4532113Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4532174Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4532752Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4532799Z _warn_cpu_init() 2025-12-04T13:21:31.4533287Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4533355Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4533642Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4533685Z return func(*args, **kwargs) 2025-12-04T13:21:31.4534172Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4534230Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4534526Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py:479: UserWarning: FSDP is switching to use `NO_SHARD` instead of ShardingStrategy.FULL_SHARD since the world size is 1. 2025-12-04T13:21:31.4534563Z _init_core_state( 2025-12-04T13:21:31.4535050Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4535109Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4535594Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.4535652Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.4535892Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4535936Z return func(*args, **kwargs) 2025-12-04T13:21:31.4536160Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4536202Z return func(*args, **kwargs) 2025-12-04T13:21:31.4536425Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 
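The UserWarnings in this rerun all point at device placement: the model is constructed on CPU, `device_id` is passed as the bare "cuda" string without an index, no per-rank torch.cuda.set_device() call is made, and init_process_group is not told which device backs barrier(). A sketch of the initialization those warnings ask for, assuming one GPU per rank; TinyModel is a placeholder rather than the test's mixture-of-experts model:

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

class TinyModel(nn.Module):  # placeholder for the test's model
    def __init__(self) -> None:
        super().__init__()
        self.linear = nn.Linear(8, 8)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)

def init_rank(rank: int) -> FSDP:
    device = torch.device("cuda", rank)
    # Bind this rank to its GPU before any collective or FSDP call...
    torch.cuda.set_device(device)
    # ...and tell the process group which device backs barrier()/NCCL ops.
    dist.init_process_group(backend="nccl", init_method="env://", device_id=device)
    # An indexed device_id lets FSDP run sharding initialization on the GPU
    # (instead of CPU) and is needed for sync_module_states=True, which
    # requires GPU communication.
    return FSDP(TinyModel(), device_id=device, sync_module_states=True)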
2025-12-04T13:21:31.4536467Z return func(*args, **kwargs) 2025-12-04T13:21:31.4536687Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4536727Z return func(*args, **kwargs) 2025-12-04T13:21:31.4536947Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4536997Z return func(*args, **kwargs) 2025-12-04T13:21:31.4537216Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4537273Z return func(*args, **kwargs) 2025-12-04T13:21:31.4537493Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4537543Z return func(*args, **kwargs) 2025-12-04T13:21:31.4537762Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4537802Z return func(*args, **kwargs) 2025-12-04T13:21:31.4537947Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4538112Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4538438Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4538593Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4538881Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4539009Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4539287Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4539436Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4539713Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4539863Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4540159Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File 

"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4540297Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4540577Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4540724Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4541232Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4541361Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4541572Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4541956Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4542089Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4542301Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4542467Z [rank1]:E1204 13:16:36.955000 555470 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4542507Z dist init r=1, world=4 2025-12-04T13:21:31.4542645Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4542805Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4543095Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4543250Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4543534Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4543660Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4543936Z [rank2]:E1204 13:16:37.004000 555471 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4544084Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4544370Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4544517Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4544793Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4544928Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4545212Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4545363Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4545878Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 2. CUDA driver allocated memory was 2300575744 and is now 17469276160. 
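For scale, the byte counts in the leak report just above convert to roughly 78 KiB of retained caching-allocator memory and about 14 GiB of additional driver-side memory on device 2. A quick sanity computation with the numbers copied from the log (not part of the test output):

# Values reported for device 2 in the RuntimeError above.
alloc_before, alloc_after = 512, 80_384
driver_before, driver_after = 2_300_575_744, 17_469_276_160

print((alloc_after - alloc_before) / 1024)       # 78.0 KiB held by the caching allocator
print((driver_after - driver_before) / 1024**3)  # ~14.13 GiB more driver-allocated memory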
2025-12-04T13:21:31.4546006Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4546201Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4546597Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4546713Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4546924Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4547089Z [rank2]:E1204 13:16:37.004000 555471 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4547129Z dist init r=2, world=4 2025-12-04T13:21:31.4547267Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4547427Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4547715Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4547869Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4548189Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4548314Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4548604Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4548754Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4549028Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4549175Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4549449Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4549588Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] with 
policy(): 2025-12-04T13:21:31.4549878Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4550040Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4550544Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 3. CUDA driver allocated memory was 2250244096 and is now 17418944512. 2025-12-04T13:21:31.4550671Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4550868Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4551249Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4551364Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4551575Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4551740Z [rank3]:E1204 13:16:37.007000 555472 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4551780Z dist init r=3, world=4 2025-12-04T13:21:31.4551918Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4552078Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4552368Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4552522Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4552820Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4552946Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4553222Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4553370Z [rank0]:E1204 13:16:37.015000 555469 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4553645Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4553791Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4554088Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4554224Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4554516Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4554665Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4555178Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 0. CUDA driver allocated memory was 2453667840 and is now 17622368256. 2025-12-04T13:21:31.4555292Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4555487Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4555870Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4555984Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4556196Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4556360Z [rank0]:E1204 13:16:37.015000 555469 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4556400Z dist init r=0, world=4 2025-12-04T13:21:31.4556738Z [rank1]:[W1204 13:16:37.802322408 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4557071Z [rank2]:[W1204 13:16:37.936681677 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4557410Z [rank3]:[W1204 13:16:37.945668464 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4557738Z [rank0]:[W1204 13:16:37.959967906 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4557779Z FAILED [47.1474s] [100%] 2025-12-04T13:21:31.4557781Z 2025-12-04T13:21:31.4557840Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4557963Z _ TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda _ 2025-12-04T13:21:31.4558009Z Traceback (most recent call last): 2025-12-04T13:21:31.4558206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4558250Z self._join_processes(fn) 2025-12-04T13:21:31.4558441Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4558507Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4558685Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4560900Z raise RuntimeError(error) 2025-12-04T13:21:31.4560988Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4561064Z Traceback (most recent call last): 2025-12-04T13:21:31.4561236Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4561280Z getattr(self, test_name)() 2025-12-04T13:21:31.4561442Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4561479Z fn() 2025-12-04T13:21:31.4561631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4561675Z method(*args, **kwargs) 2025-12-04T13:21:31.4561826Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4561868Z method(*args, **kwargs) 2025-12-04T13:21:31.4562018Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4562058Z with policy(): 2025-12-04T13:21:31.4562210Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4562251Z raise RuntimeError(msg) 2025-12-04T13:21:31.4562634Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 
2025-12-04T13:21:31.4562639Z 2025-12-04T13:21:31.4562715Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4562973Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4562976Z 2025-12-04T13:21:31.4563066Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4563068Z 2025-12-04T13:21:31.4563070Z 2025-12-04T13:21:31.4563150Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4563272Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4563510Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5aef99b217eb6286.xml - 2025-12-04T13:21:31.4563572Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4563844Z FAILED [47.1474s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4563891Z Traceback (most recent call last): 2025-12-04T13:21:31.4564055Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4564098Z getattr(self, test_name)() 2025-12-04T13:21:31.4564259Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4564294Z fn() 2025-12-04T13:21:31.4564456Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4564497Z method(*args, **kwargs) 2025-12-04T13:21:31.4564660Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4564700Z method(*args, **kwargs) 2025-12-04T13:21:31.4564849Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4564886Z with policy(): 2025-12-04T13:21:31.4565049Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4565089Z raise RuntimeError(msg) 2025-12-04T13:21:31.4565471Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 80384 on device 1. CUDA driver allocated memory was 2317352960 and is now 17486053376. 2025-12-04T13:21:31.4565473Z 2025-12-04T13:21:31.4565550Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4565807Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4565810Z 2025-12-04T13:21:31.4565896Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4565961Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
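Each failing rank exits with code 10, and the parent test process (via _join_processes / _check_return_codes in common_distributed.py) converts any nonzero child exit code into the RuntimeError shown above before pytest reruns the test. A stripped-down sketch of that parent/child pattern using plain multiprocessing; the worker body and the reuse of exit code 10 are illustrative:

import multiprocessing as mp
import sys

LEAK_EXIT_CODE = 10  # matches the per-rank exit code reported in this log

def worker(rank: int) -> None:
    # Stand-in for run_test(); a detected failure exits with a dedicated code.
    failed = True
    if failed:
        sys.exit(LEAK_EXIT_CODE)

def main() -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=worker, args=(r,)) for r in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    main()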
2025-12-04T13:21:31.4566024Z ====================== 1 failed, 18 deselected in 47.29s ======================= 2025-12-04T13:21:31.4566061Z Got exit code 1 2025-12-04T13:21:31.4566266Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda 2025-12-04T13:21:31.4566396Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4566586Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f32c471be9ed6c85.xml 2025-12-04T13:21:31.4566644Z ============================= test session starts ============================== 2025-12-04T13:21:31.4566758Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4566801Z cachedir: .pytest_cache 2025-12-04T13:21:31.4566959Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4567007Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4567046Z configfile: pytest.ini 2025-12-04T13:21:31.4567221Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4567298Z collecting ... collected 60 items / 11 deselected / 49 selected 2025-12-04T13:21:31.4567353Z stepcurrent: skipping 11 already run items. 2025-12-04T13:21:31.4567396Z Running 8 items in this shard 2025-12-04T13:21:31.4567400Z 2025-12-04T13:21:31.4567726Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda I1204 13:16:53.929000 556810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 556879 2025-12-04T13:21:31.4567883Z I1204 13:16:53.930000 556810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 556880 2025-12-04T13:21:31.4568034Z I1204 13:16:53.930000 556810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 556881 2025-12-04T13:21:31.4568223Z I1204 13:16:53.931000 556810 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 556882 2025-12-04T13:21:31.4568819Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4568871Z _warn_cpu_init() 2025-12-04T13:21:31.4569454Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4569493Z _warn_cpu_init() 2025-12-04T13:21:31.4570057Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4570095Z _warn_cpu_init() 2025-12-04T13:21:31.4570663Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4570700Z _warn_cpu_init() 2025-12-04T13:21:31.4570989Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4571033Z return func(*args, **kwargs) 2025-12-04T13:21:31.4571175Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4571340Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4571647Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4571803Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4572088Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4572214Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4572494Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4572643Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4572930Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4573089Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4573363Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4573509Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4573787Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4573937Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4574431Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 74240 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:21:31.4574548Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4574745Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4575125Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4575240Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4575453Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4575619Z [rank1]:E1204 13:17:01.847000 556880 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4575657Z dist init r=1, world=4 2025-12-04T13:21:31.4575807Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4575967Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4576255Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4576409Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4576695Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4576819Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4577105Z [rank3]:E1204 13:17:01.854000 556882 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4577265Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4577540Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4577701Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4577976Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4578115Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4578453Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4578602Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4579098Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 
2025-12-04T13:21:31.4579213Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4579409Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4579787Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4579903Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4580129Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4580294Z [rank3]:E1204 13:17:01.854000 556882 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4580333Z dist init r=3, world=4 2025-12-04T13:21:31.4580471Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4580631Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4580917Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4581073Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4581372Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4581509Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4581786Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4581935Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4582226Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4582373Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4582648Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4582784Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4583062Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4583210Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4583704Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:21:31.4583819Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4584013Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4584408Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4584524Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4584735Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4584901Z [rank0]:E1204 13:17:01.899000 556879 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4584939Z dist init r=0, world=4 2025-12-04T13:21:31.4585077Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4585238Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4585526Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4585692Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4585986Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4586110Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4586402Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4586552Z [rank2]:E1204 13:17:01.907000 556881 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4586830Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4586979Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4587253Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4587390Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4587668Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4587817Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4588354Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 74240 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 2025-12-04T13:21:31.4588469Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4588679Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4589059Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4589174Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4589386Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4589552Z [rank2]:E1204 13:17:01.907000 556881 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4589590Z dist init r=2, world=4 2025-12-04T13:21:31.4589940Z [rank0]:[W1204 13:17:02.814163560 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4589981Z FAILED [9.8157s] [ 12%] 2025-12-04T13:21:31.4589997Z 2025-12-04T13:21:31.4590053Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4590167Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.4590213Z Traceback (most recent call last): 2025-12-04T13:21:31.4590377Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4590436Z self._join_processes(fn) 2025-12-04T13:21:31.4590609Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4590662Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4590842Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4590886Z raise RuntimeError(error) 2025-12-04T13:21:31.4590968Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4591014Z Traceback (most recent call last): 2025-12-04T13:21:31.4591176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4591218Z getattr(self, test_name)() 2025-12-04T13:21:31.4591379Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4591414Z fn() 2025-12-04T13:21:31.4591567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4591608Z method(*args, **kwargs) 2025-12-04T13:21:31.4591761Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4591802Z method(*args, **kwargs) 2025-12-04T13:21:31.4591952Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4591989Z with policy(): 2025-12-04T13:21:31.4592141Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4592182Z raise RuntimeError(msg) 2025-12-04T13:21:31.4592550Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 
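The failure above comes from a before/after memory comparison: the harness records the caching-allocator usage when the test starts and again when it ends, and raises if the numbers moved. A minimal sketch of that idea using only public torch.cuda counters follows; it is a simplified illustration, not the CudaMemoryLeakCheck code in common_utils.py, and the device index and zero-byte tolerance are illustrative choices.

    import torch

    def run_with_leak_check(fn, device=0, tolerance_bytes=0):
        # Snapshot caching-allocator usage on one device before the test body.
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)
        fn()
        # Release cached blocks the allocator may still hold, then re-measure.
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        after = torch.cuda.memory_allocated(device)
        if after - before > tolerance_bytes:
            raise RuntimeError(
                f"possible leak on device {device}: {before} -> {after} bytes"
            )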
2025-12-04T13:21:31.4592554Z 2025-12-04T13:21:31.4592640Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4592891Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4592894Z 2025-12-04T13:21:31.4592982Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4592985Z 2025-12-04T13:21:31.4592987Z 2025-12-04T13:21:31.4593063Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4593151Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4593387Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-f32c471be9ed6c85.xml - 2025-12-04T13:21:31.4593448Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4593726Z FAILED [9.8157s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4593773Z Traceback (most recent call last): 2025-12-04T13:21:31.4593950Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4593993Z getattr(self, test_name)() 2025-12-04T13:21:31.4594153Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4594199Z fn() 2025-12-04T13:21:31.4594351Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4594389Z method(*args, **kwargs) 2025-12-04T13:21:31.4594541Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4594580Z method(*args, **kwargs) 2025-12-04T13:21:31.4594731Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4594768Z with policy(): 2025-12-04T13:21:31.4594921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4594961Z raise RuntimeError(msg) 2025-12-04T13:21:31.4595327Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:21:31.4595330Z 2025-12-04T13:21:31.4595405Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4595655Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4595657Z 2025-12-04T13:21:31.4595745Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4595809Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.4595871Z ======================= 1 failed, 11 deselected in 9.98s ======================= 2025-12-04T13:21:31.4595908Z Got exit code 1 2025-12-04T13:21:31.4595949Z Retrying single test... 2025-12-04T13:21:31.4596138Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4f90149723f3b30c.xml 2025-12-04T13:21:31.4596197Z ============================= test session starts ============================== 2025-12-04T13:21:31.4596321Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4596364Z cachedir: .pytest_cache 2025-12-04T13:21:31.4596522Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4596568Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4596609Z configfile: pytest.ini 2025-12-04T13:21:31.4596773Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4596848Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4597090Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4597137Z Running 1 items in this shard 2025-12-04T13:21:31.4597139Z 2025-12-04T13:21:31.4597461Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda I1204 13:17:06.739000 557212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 557281 2025-12-04T13:21:31.4597626Z I1204 13:17:06.740000 557212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 557282 2025-12-04T13:21:31.4597788Z I1204 13:17:06.741000 557212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 557283 2025-12-04T13:21:31.4597938Z I1204 13:17:06.741000 557212 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 557284 2025-12-04T13:21:31.4598568Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4598623Z _warn_cpu_init() 2025-12-04T13:21:31.4599194Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
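The `_warn_cpu_init()` UserWarning repeated in this log recommends passing `device_id` when wrapping a CPU-resident module so that sharding initialization (and `sync_module_states=True`) run on the GPU. A minimal sketch of that recommendation is below; it assumes a process group is already initialized and uses a plain nn.Linear as a stand-in module.

    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # A module constructed on CPU; giving FSDP a device_id lets it move the
    # module to the local GPU itself, so sharding init does not run on CPU.
    module = nn.Linear(1024, 1024)
    fsdp_module = FSDP(
        module,
        device_id=torch.cuda.current_device(),
        sync_module_states=True,  # needs the module on GPU, per the warning
    )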
2025-12-04T13:21:31.4599233Z _warn_cpu_init() 2025-12-04T13:21:31.4599800Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4599837Z _warn_cpu_init() 2025-12-04T13:21:31.4600403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4600440Z _warn_cpu_init() 2025-12-04T13:21:31.4600746Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4600789Z return func(*args, **kwargs) 2025-12-04T13:21:31.4600934Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4601098Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4601387Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4601544Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4601830Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4601967Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4602247Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4602416Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4602693Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4602852Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4603132Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4603270Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4603548Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4603697Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4604191Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:21:31.4604309Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4604505Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4604881Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4605008Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4605222Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4605388Z [rank1]:E1204 13:17:14.730000 557282 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4605428Z dist init r=1, world=4 2025-12-04T13:21:31.4605566Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4605724Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4606013Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4606166Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4606463Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4606600Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4606876Z [rank2]:E1204 13:17:14.733000 557283 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4607035Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4607313Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4607461Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4607739Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4607875Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4608206Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4608355Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4608847Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 78336 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
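The c10d_logger barrier() warning seen earlier ("using the device under current context") can be silenced the way that warning suggests: bind the process group to a specific device at init time. A minimal sketch follows; it assumes RANK, WORLD_SIZE, MASTER_ADDR, and MASTER_PORT are supplied by the launcher, and the `device_id` keyword is available in recent PyTorch releases.

    import os
    import torch
    import torch.distributed as dist

    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    local_device = torch.device("cuda", rank % torch.cuda.device_count())

    # Passing device_id ties the default process group to one GPU, so
    # collectives such as barrier() no longer guess the device from context.
    dist.init_process_group(
        backend="nccl",
        rank=rank,
        world_size=world_size,
        device_id=local_device,
    )
    dist.barrier()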
2025-12-04T13:21:31.4608963Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4609160Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4609547Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4609662Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4609875Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4610039Z [rank2]:E1204 13:17:14.733000 557283 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4610079Z dist init r=2, world=4 2025-12-04T13:21:31.4610216Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4610376Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4610677Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4610845Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4611131Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4611269Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4611546Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4611694Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4611971Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4612118Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4612398Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4612534Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4612812Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4612962Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4613454Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:21:31.4613579Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4613775Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4614150Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4614265Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4614478Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4614646Z [rank3]:E1204 13:17:14.773000 557284 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4614684Z dist init r=3, world=4 2025-12-04T13:21:31.4614833Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4615002Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4615291Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4615455Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4615742Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4615865Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4616143Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4616293Z [rank0]:E1204 13:17:14.782000 557281 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4616568Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4616717Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4616995Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4617132Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4617409Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4617560Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4618062Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 76288 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:21:31.4618227Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4618424Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4618796Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4618912Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4619146Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4619310Z [rank0]:E1204 13:17:14.782000 557281 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4619363Z dist init r=0, world=4 2025-12-04T13:21:31.4619699Z [rank0]:[W1204 13:17:15.721097705 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4619755Z FAILED [10.1155s] [100%] 2025-12-04T13:21:31.4619758Z 2025-12-04T13:21:31.4619815Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4619929Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.4619976Z Traceback (most recent call last): 2025-12-04T13:21:31.4620139Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4620184Z self._join_processes(fn) 2025-12-04T13:21:31.4620356Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4620411Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4620588Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4620634Z raise RuntimeError(error) 2025-12-04T13:21:31.4620715Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4620760Z Traceback (most recent call last): 2025-12-04T13:21:31.4620921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4620963Z getattr(self, test_name)() 2025-12-04T13:21:31.4621122Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4621156Z fn() 2025-12-04T13:21:31.4621308Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4621348Z method(*args, **kwargs) 2025-12-04T13:21:31.4621498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4621540Z method(*args, **kwargs) 2025-12-04T13:21:31.4621690Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4621727Z with policy(): 2025-12-04T13:21:31.4621891Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4621932Z raise RuntimeError(msg) 2025-12-04T13:21:31.4622304Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 
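The ProcessGroupNCCL warning above ("destroy_process_group() was not called before program exit") points at missing teardown rather than at the test body itself. A minimal sketch of the shutdown pattern it asks for, with a placeholder main():

    import torch.distributed as dist

    def main():
        # ... per-rank work that uses the default process group ...
        pass

    if __name__ == "__main__":
        try:
            main()
        finally:
            if dist.is_initialized():
                # Explicit teardown avoids the NCCL resource-leak warning at exit.
                dist.destroy_process_group()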
2025-12-04T13:21:31.4622308Z 2025-12-04T13:21:31.4622384Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4622630Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4622634Z 2025-12-04T13:21:31.4622721Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4622725Z 2025-12-04T13:21:31.4622785Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4622830Z Traceback (most recent call last): 2025-12-04T13:21:31.4623004Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4623676Z getattr(self, test_name)() 2025-12-04T13:21:31.4623835Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4623869Z fn() 2025-12-04T13:21:31.4624020Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4624069Z method(*args, **kwargs) 2025-12-04T13:21:31.4624220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4624258Z method(*args, **kwargs) 2025-12-04T13:21:31.4624408Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4624445Z with policy(): 2025-12-04T13:21:31.4624598Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4624640Z raise RuntimeError(msg) 2025-12-04T13:21:31.4625003Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 78336 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
2025-12-04T13:21:31.4625005Z 2025-12-04T13:21:31.4625080Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4625327Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4625331Z 2025-12-04T13:21:31.4625418Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4625420Z 2025-12-04T13:21:31.4625479Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4625524Z Traceback (most recent call last): 2025-12-04T13:21:31.4625686Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4625728Z getattr(self, test_name)() 2025-12-04T13:21:31.4625886Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4625921Z fn() 2025-12-04T13:21:31.4626070Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4626109Z method(*args, **kwargs) 2025-12-04T13:21:31.4626268Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4626307Z method(*args, **kwargs) 2025-12-04T13:21:31.4626458Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4626495Z with policy(): 2025-12-04T13:21:31.4626646Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4626686Z raise RuntimeError(msg) 2025-12-04T13:21:31.4627050Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:21:31.4627053Z 2025-12-04T13:21:31.4627125Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4627372Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4627384Z 2025-12-04T13:21:31.4627471Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4627483Z 2025-12-04T13:21:31.4627486Z 2025-12-04T13:21:31.4627561Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4627649Z Process 1 terminated with exit code 10, terminating remaining processes. 
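The "Process N exited with error code 10" and "terminating remaining processes" lines come from the multiprocess test harness joining its per-rank workers and checking their exit codes. The sketch below shows that general pattern with the standard library's multiprocessing; it is a simplified illustration, not the common_distributed.py implementation, and `worker` is a placeholder for the per-rank test body.

    import multiprocessing as mp

    def worker(rank, world_size):
        # Per-rank test body; raising here (or sys.exit(10)) gives the parent
        # a nonzero exit code for this rank.
        ...

    def run_test(fn=worker, world_size=4):
        ctx = mp.get_context("spawn")
        procs = [ctx.Process(target=fn, args=(rank, world_size)) for rank in range(world_size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        for rank, p in enumerate(procs):
            if p.exitcode != 0:
                raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")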
2025-12-04T13:21:31.4627882Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4f90149723f3b30c.xml - 2025-12-04T13:21:31.4627953Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4628262Z FAILED [10.1155s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4628308Z Traceback (most recent call last): 2025-12-04T13:21:31.4628471Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4628514Z getattr(self, test_name)() 2025-12-04T13:21:31.4628673Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4628707Z fn() 2025-12-04T13:21:31.4628857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4628898Z method(*args, **kwargs) 2025-12-04T13:21:31.4629048Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4629087Z method(*args, **kwargs) 2025-12-04T13:21:31.4629237Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4629274Z with policy(): 2025-12-04T13:21:31.4629426Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4629467Z raise RuntimeError(msg) 2025-12-04T13:21:31.4629834Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 70144 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 
2025-12-04T13:21:31.4629838Z 2025-12-04T13:21:31.4629910Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4630172Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4630174Z 2025-12-04T13:21:31.4630260Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4630263Z 2025-12-04T13:21:31.4630322Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4630368Z Traceback (most recent call last): 2025-12-04T13:21:31.4630530Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4630571Z getattr(self, test_name)() 2025-12-04T13:21:31.4630730Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4630764Z fn() 2025-12-04T13:21:31.4630915Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4630954Z method(*args, **kwargs) 2025-12-04T13:21:31.4631104Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4631142Z method(*args, **kwargs) 2025-12-04T13:21:31.4631305Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4631354Z with policy(): 2025-12-04T13:21:31.4631505Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4631546Z raise RuntimeError(msg) 2025-12-04T13:21:31.4631909Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 78336 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
2025-12-04T13:21:31.4631923Z 2025-12-04T13:21:31.4631997Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4632244Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4632246Z 2025-12-04T13:21:31.4632334Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4632336Z 2025-12-04T13:21:31.4632393Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4632438Z Traceback (most recent call last): 2025-12-04T13:21:31.4632599Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4632642Z getattr(self, test_name)() 2025-12-04T13:21:31.4632800Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4632834Z fn() 2025-12-04T13:21:31.4632987Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4633025Z method(*args, **kwargs) 2025-12-04T13:21:31.4633176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4633215Z method(*args, **kwargs) 2025-12-04T13:21:31.4633365Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4633400Z with policy(): 2025-12-04T13:21:31.4633551Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4633591Z raise RuntimeError(msg) 2025-12-04T13:21:31.4633971Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:21:31.4633973Z 2025-12-04T13:21:31.4634046Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4634293Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4634296Z 2025-12-04T13:21:31.4634382Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4634446Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4634510Z ====================== 1 failed, 18 deselected in 10.28s ======================= 2025-12-04T13:21:31.4634547Z Got exit code 1 2025-12-04T13:21:31.4634587Z Retrying single test... 
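The repro command printed above can also be driven from Python with the same environment variables the message names; the commented-out PYTORCH_PRINT_REPRO_ON_FAILURE=0 line is the suppression switch the log mentions. This is only one possible way to invoke the printed command, shown for convenience.

    import os
    import subprocess

    env = dict(
        os.environ,
        PYTORCH_TEST_WITH_ROCM="1",
        PYTORCH_TEST_CUDA_MEM_LEAK_CHECK="1",
        # PYTORCH_PRINT_REPRO_ON_FAILURE="0",  # uncomment to silence the repro banner
    )
    # Same command the failure message prints, run from the base repo dir.
    subprocess.run(
        [
            "python",
            "test/distributed/fsdp/test_fsdp_core.py",
            "TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda",
        ],
        env=env,
        check=False,
    )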
2025-12-04T13:21:31.4634778Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d59e0d1f8082da7d.xml 2025-12-04T13:21:31.4634837Z ============================= test session starts ============================== 2025-12-04T13:21:31.4634959Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4635010Z cachedir: .pytest_cache 2025-12-04T13:21:31.4635168Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4635215Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4635254Z configfile: pytest.ini 2025-12-04T13:21:31.4635416Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4635501Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4635744Z stepcurrent: skipping 11 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4635788Z Running 1 items in this shard 2025-12-04T13:21:31.4635790Z 2025-12-04T13:21:31.4636112Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda I1204 13:17:19.485000 557614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 557683 2025-12-04T13:21:31.4636268Z I1204 13:17:19.486000 557614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 557684 2025-12-04T13:21:31.4636421Z I1204 13:17:19.486000 557614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 557685 2025-12-04T13:21:31.4636573Z I1204 13:17:19.487000 557614 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 557686 2025-12-04T13:21:31.4637157Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4637196Z _warn_cpu_init() 2025-12-04T13:21:31.4637764Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4637802Z _warn_cpu_init() 2025-12-04T13:21:31.4638420Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. 
`module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4638458Z _warn_cpu_init() 2025-12-04T13:21:31.4639024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4639061Z _warn_cpu_init() 2025-12-04T13:21:31.4639368Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4639424Z return func(*args, **kwargs) 2025-12-04T13:21:31.4639567Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4639729Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4640017Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4640185Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4640471Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4640598Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4640874Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4641023Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4641303Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4641451Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4641728Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4641864Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4642141Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4642299Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4642795Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:21:31.4642912Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4643107Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4643485Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4643610Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4643833Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4643998Z [rank1]:E1204 13:17:27.533000 557684 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4644038Z dist init r=1, world=4 2025-12-04T13:21:31.4644187Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4644346Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4644634Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4644786Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4645071Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4645194Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4645471Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4645619Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4645894Z [rank0]:E1204 13:17:27.553000 557683 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4646044Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4646318Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4646455Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4646746Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4646894Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4647385Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 74240 on device 0. CUDA driver allocated memory was 2453667840 and is now 3986685952. 2025-12-04T13:21:31.4647501Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4647697Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4648079Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4648239Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4648451Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4648635Z [rank0]:E1204 13:17:27.553000 557683 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4648673Z dist init r=0, world=4 2025-12-04T13:21:31.4648812Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4648972Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4649259Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4649412Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4649697Z 
[rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4649821Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4650096Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4650244Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4650519Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4650668Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4650958Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4651094Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4651371Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4651519Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4652011Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 64000 on device 2. CUDA driver allocated memory was 2300575744 and is now 3833593856. 
2025-12-04T13:21:31.4652167Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4652421Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4652795Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4652919Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4653131Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4653296Z [rank2]:E1204 13:17:27.592000 557685 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4653335Z dist init r=2, world=4 2025-12-04T13:21:31.4653473Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4653632Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4653918Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4654074Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4654358Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4654482Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4654758Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4654906Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4655192Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4655340Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4655618Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4655754Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4656031Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4656181Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4656679Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 66048 on device 3. CUDA driver allocated memory was 2250244096 and is now 3783262208. 2025-12-04T13:21:31.4656803Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4657000Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4657383Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4657498Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4657708Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4657873Z [rank3]:E1204 13:17:27.620000 557686 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4657911Z dist init r=3, world=4 2025-12-04T13:21:31.4658281Z [rank0]:[W1204 13:17:27.475103950 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4658322Z FAILED [10.0158s] [100%] 2025-12-04T13:21:31.4658324Z 2025-12-04T13:21:31.4658381Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4658495Z _ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda _ 2025-12-04T13:21:31.4658542Z Traceback (most recent call last): 2025-12-04T13:21:31.4658705Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4658748Z self._join_processes(fn) 2025-12-04T13:21:31.4658921Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4658975Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4659153Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4659196Z raise RuntimeError(error) 2025-12-04T13:21:31.4659289Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4659334Z Traceback (most recent call last): 2025-12-04T13:21:31.4659496Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4659538Z getattr(self, test_name)() 2025-12-04T13:21:31.4659696Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4659729Z fn() 2025-12-04T13:21:31.4659880Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4659921Z method(*args, **kwargs) 2025-12-04T13:21:31.4660071Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4660109Z method(*args, **kwargs) 2025-12-04T13:21:31.4660261Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4660298Z with policy(): 2025-12-04T13:21:31.4660463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4660515Z raise RuntimeError(msg) 2025-12-04T13:21:31.4660887Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 
2025-12-04T13:21:31.4660903Z 2025-12-04T13:21:31.4660979Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4661226Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4661228Z 2025-12-04T13:21:31.4661317Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4661320Z 2025-12-04T13:21:31.4661321Z 2025-12-04T13:21:31.4661396Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4661485Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4661716Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d59e0d1f8082da7d.xml - 2025-12-04T13:21:31.4661777Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4662040Z FAILED [10.0158s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4662088Z Traceback (most recent call last): 2025-12-04T13:21:31.4662253Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4662293Z getattr(self, test_name)() 2025-12-04T13:21:31.4662454Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4662487Z fn() 2025-12-04T13:21:31.4662639Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4662678Z method(*args, **kwargs) 2025-12-04T13:21:31.4662831Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4662870Z method(*args, **kwargs) 2025-12-04T13:21:31.4663021Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4663067Z with policy(): 2025-12-04T13:21:31.4663220Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4663260Z raise RuntimeError(msg) 2025-12-04T13:21:31.4663627Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 1. CUDA driver allocated memory was 2317352960 and is now 3850371072. 2025-12-04T13:21:31.4663630Z 2025-12-04T13:21:31.4663703Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4663952Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4663954Z 2025-12-04T13:21:31.4664042Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4664104Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
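The UserWarning from _init_utils.py repeated above recommends giving FSDP a device_id so sharding initialization runs on the GPU and sync_module_states=True can communicate. A hedged sketch of that usage follows; the module and rank are placeholders rather than the models defined in test_fsdp_core.py, and it assumes a process group has already been initialized.

    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Placeholder module and local rank; a real process group must already exist.
    local_rank = 0
    module = torch.nn.Linear(8, 8)

    wrapped = FSDP(
        module,
        device_id=torch.device("cuda", local_rank),  # run sharding init on this GPU
        sync_module_states=True,                      # needs the module on GPU to broadcast states
    )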
2025-12-04T13:21:31.4664176Z ====================== 1 failed, 18 deselected in 10.17s ======================= 2025-12-04T13:21:31.4664229Z Got exit code 1 2025-12-04T13:21:31.4664425Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.4664553Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4664740Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6683d1b284d3f9c9.xml 2025-12-04T13:21:31.4664807Z ============================= test session starts ============================== 2025-12-04T13:21:31.4664920Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4664961Z cachedir: .pytest_cache 2025-12-04T13:21:31.4665121Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4665167Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4665208Z configfile: pytest.ini 2025-12-04T13:21:31.4665369Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4665444Z collecting ... collected 60 items / 12 deselected / 48 selected 2025-12-04T13:21:31.4665497Z stepcurrent: skipping 12 already run items. 2025-12-04T13:21:31.4665541Z Running 7 items in this shard 2025-12-04T13:21:31.4665544Z 2025-12-04T13:21:31.4665854Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda I1204 13:17:32.412000 558016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 558085 2025-12-04T13:21:31.4666009Z I1204 13:17:32.413000 558016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 558086 2025-12-04T13:21:31.4666162Z I1204 13:17:32.414000 558016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 558087 2025-12-04T13:21:31.4666312Z I1204 13:17:32.414000 558016 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 558088 2025-12-04T13:21:31.4666888Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4666926Z _warn_cpu_init() 2025-12-04T13:21:31.4667228Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4667272Z return func(*args, **kwargs) 2025-12-04T13:21:31.4667842Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. 
We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4667880Z _warn_cpu_init() 2025-12-04T13:21:31.4668490Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4668543Z _warn_cpu_init() 2025-12-04T13:21:31.4669105Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4669155Z _warn_cpu_init() 2025-12-04T13:21:31.4669300Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4669463Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4669751Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4669907Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4670192Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4670318Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4670595Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4670745Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4671021Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4671170Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4671456Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4671594Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4671871Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4672019Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4672507Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 22016 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:21:31.4672634Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4672828Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4673210Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4673337Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4673549Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4673714Z [rank0]:E1204 13:17:40.659000 558085 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4673753Z dist init r=0, world=4 2025-12-04T13:21:31.4673891Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4674049Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4674336Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4674492Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4674779Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4674905Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4675181Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4675329Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4675614Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4675761Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4676038Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4676174Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4676451Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4676600Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4677095Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:21:31.4677220Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4677414Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4677786Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4677899Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4678111Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4678346Z [rank3]:E1204 13:17:40.715000 558088 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4678484Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4678642Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4678929Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4679085Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4679369Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4679494Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4679770Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4679934Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4680211Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4680358Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4680632Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4680768Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4681046Z [rank1]:E1204 13:17:40.716000 
558086 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4681214Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4681696Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 28160 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4681827Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4682034Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4682396Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4682509Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4682719Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4682881Z [rank1]:E1204 13:17:40.716000 558086 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4682920Z dist init r=3, world=4 2025-12-04T13:21:31.4682957Z dist init r=1, world=4 2025-12-04T13:21:31.4683095Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4683253Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4683541Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4683701Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4683984Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4684108Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4684393Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4684541Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4684815Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4684962Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4685238Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4685384Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4685661Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4685819Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4686301Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4686426Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4686621Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4686982Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4687096Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4687308Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4687471Z [rank2]:E1204 13:17:40.719000 558087 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4687511Z dist init r=2, world=4 2025-12-04T13:21:31.4687846Z [rank0]:[W1204 13:17:40.514976621 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4687887Z FAILED [10.1169s] [ 14%] 2025-12-04T13:21:31.4687890Z 2025-12-04T13:21:31.4687946Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4688046Z __ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda __ 2025-12-04T13:21:31.4688093Z Traceback (most recent call last): 2025-12-04T13:21:31.4688292Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4688348Z self._join_processes(fn) 2025-12-04T13:21:31.4688522Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4688576Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4688756Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4688799Z raise RuntimeError(error) 2025-12-04T13:21:31.4688880Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4688925Z Traceback (most recent call last): 2025-12-04T13:21:31.4689088Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4689130Z getattr(self, test_name)() 2025-12-04T13:21:31.4689289Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4689323Z fn() 2025-12-04T13:21:31.4689487Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4689529Z method(*args, **kwargs) 2025-12-04T13:21:31.4689693Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4689733Z method(*args, **kwargs) 2025-12-04T13:21:31.4689883Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4689920Z with policy(): 2025-12-04T13:21:31.4690083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4690124Z raise RuntimeError(msg) 2025-12-04T13:21:31.4690481Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 22016 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4690484Z 2025-12-04T13:21:31.4690559Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4690795Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4690797Z 2025-12-04T13:21:31.4690885Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4690888Z 2025-12-04T13:21:31.4690890Z 2025-12-04T13:21:31.4690964Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4691052Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4691285Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6683d1b284d3f9c9.xml - 2025-12-04T13:21:31.4691345Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4691597Z FAILED [10.1169s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4691645Z Traceback (most recent call last): 2025-12-04T13:21:31.4691809Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4691853Z getattr(self, test_name)() 2025-12-04T13:21:31.4692013Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4692048Z fn() 2025-12-04T13:21:31.4692209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4692249Z method(*args, **kwargs) 2025-12-04T13:21:31.4692401Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4692441Z method(*args, **kwargs) 2025-12-04T13:21:31.4692590Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4692626Z with policy(): 2025-12-04T13:21:31.4692778Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4692820Z raise RuntimeError(msg) 2025-12-04T13:21:31.4693174Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 22016 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:21:31.4693176Z 2025-12-04T13:21:31.4693251Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4693493Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4693506Z 2025-12-04T13:21:31.4693593Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4693656Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
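Two other warnings recur in these runs: barrier() falling back to "the device under current context", which the message says can be silenced by passing device_id to init_process_group, and ProcessGroupNCCL complaining that destroy_process_group() was not called before exit. A minimal lifecycle sketch that addresses both is shown below; rank and world size are placeholders, and it assumes MASTER_ADDR/MASTER_PORT are provided by the launcher.

    import os
    import torch
    import torch.distributed as dist

    # Placeholder values; in the test harness these come from the spawned worker processes.
    rank = int(os.environ.get("RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))

    dist.init_process_group(
        backend="nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),  # bind the group to one device; silences the barrier() warning
    )

    dist.barrier()

    # ... distributed workload would run here ...

    dist.destroy_process_group()  # explicit teardown avoids the ProcessGroupNCCL shutdown warning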
2025-12-04T13:21:31.4693718Z ====================== 1 failed, 12 deselected in 10.28s ======================= 2025-12-04T13:21:31.4693765Z Got exit code 1 2025-12-04T13:21:31.4693805Z Retrying single test... 2025-12-04T13:21:31.4693994Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d19ef080ca548a7a.xml 2025-12-04T13:21:31.4694052Z ============================= test session starts ============================== 2025-12-04T13:21:31.4694164Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4694204Z cachedir: .pytest_cache 2025-12-04T13:21:31.4694365Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4694410Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4694450Z configfile: pytest.ini 2025-12-04T13:21:31.4694611Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4694688Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4694916Z stepcurrent: skipping 12 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4694960Z Running 1 items in this shard 2025-12-04T13:21:31.4694963Z 2025-12-04T13:21:31.4695271Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda I1204 13:17:45.344000 558418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 558487 2025-12-04T13:21:31.4695426Z I1204 13:17:45.345000 558418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 558488 2025-12-04T13:21:31.4695577Z I1204 13:17:45.345000 558418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 558489 2025-12-04T13:21:31.4695726Z I1204 13:17:45.346000 558418 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 558490 2025-12-04T13:21:31.4696323Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4696361Z _warn_cpu_init() 2025-12-04T13:21:31.4696931Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4696969Z _warn_cpu_init() 2025-12-04T13:21:31.4697544Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4697591Z _warn_cpu_init() 2025-12-04T13:21:31.4698194Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4698247Z _warn_cpu_init() 2025-12-04T13:21:31.4698539Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4698582Z return func(*args, **kwargs) 2025-12-04T13:21:31.4698725Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4698886Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4699175Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4699333Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4699629Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4699754Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4700033Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4700182Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4700470Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4700620Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4700894Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4701032Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4701307Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4701456Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4701953Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4702081Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4702276Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4702647Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4702762Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4702973Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4703138Z [rank1]:E1204 13:17:53.699000 558488 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4703178Z dist init r=1, world=4 2025-12-04T13:21:31.4703315Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4703475Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4703762Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4703918Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4704203Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4704328Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4704604Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4704760Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4705036Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4705182Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4705457Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4705593Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4705871Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4706028Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4706523Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:21:31.4706648Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4706843Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4707205Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4707318Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4707529Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4707693Z [rank3]:E1204 13:17:53.707000 558490 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4707732Z dist init r=3, world=4 2025-12-04T13:21:31.4707869Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4708029Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4708344Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4708498Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4708781Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4708904Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4709198Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4709346Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4709621Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4709768Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4710043Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4710190Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4710465Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4710626Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4711105Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4711248Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4711445Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4711807Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4711922Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4712132Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4712298Z [rank2]:E1204 13:17:53.751000 558489 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4712336Z dist init r=2, world=4 2025-12-04T13:21:31.4712474Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4712633Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4712919Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4713075Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4713370Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4713494Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4713770Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4713917Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.4714193Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4714340Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4714626Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4714770Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4715047Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4715205Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4715686Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:21:31.4715801Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4715996Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4716357Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4716470Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4716681Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4716844Z [rank0]:E1204 13:17:53.754000 558487 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4716884Z dist init r=0, world=4 2025-12-04T13:21:31.4717221Z [rank0]:[W1204 13:17:54.714627307 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4717263Z FAILED [10.3163s] [100%] 2025-12-04T13:21:31.4717265Z 2025-12-04T13:21:31.4717321Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4717429Z __ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda __ 2025-12-04T13:21:31.4717477Z Traceback (most recent call last): 2025-12-04T13:21:31.4717640Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4717685Z self._join_processes(fn) 2025-12-04T13:21:31.4717857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4717910Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4718088Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4718132Z raise RuntimeError(error) 2025-12-04T13:21:31.4718258Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4718305Z Traceback (most recent call last): 2025-12-04T13:21:31.4718466Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4718522Z getattr(self, test_name)() 2025-12-04T13:21:31.4718680Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4718727Z fn() 2025-12-04T13:21:31.4718878Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4718919Z method(*args, **kwargs) 2025-12-04T13:21:31.4719069Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4719124Z method(*args, **kwargs) 2025-12-04T13:21:31.4719275Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4719313Z with policy(): 2025-12-04T13:21:31.4719465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4719507Z raise RuntimeError(msg) 2025-12-04T13:21:31.4719862Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:21:31.4719866Z 2025-12-04T13:21:31.4719941Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4720177Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4720179Z 2025-12-04T13:21:31.4720266Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4720269Z 2025-12-04T13:21:31.4720270Z 2025-12-04T13:21:31.4720345Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4720434Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4720669Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d19ef080ca548a7a.xml - 2025-12-04T13:21:31.4720728Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4720980Z FAILED [10.3163s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4721028Z Traceback (most recent call last): 2025-12-04T13:21:31.4721191Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4721247Z getattr(self, test_name)() 2025-12-04T13:21:31.4721407Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4721442Z fn() 2025-12-04T13:21:31.4721594Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4721636Z method(*args, **kwargs) 2025-12-04T13:21:31.4721787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4721826Z method(*args, **kwargs) 2025-12-04T13:21:31.4721976Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4722013Z with policy(): 2025-12-04T13:21:31.4722166Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4722206Z raise RuntimeError(msg) 2025-12-04T13:21:31.4722574Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 15872 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4722587Z 2025-12-04T13:21:31.4722663Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4722895Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4722908Z 2025-12-04T13:21:31.4722997Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4723059Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
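Each rank also prints the _warn_cpu_init() UserWarning because the module handed to FSDP is still on CPU when sharding initialization starts. As the warning itself suggests, passing device_id moves the module to the rank's GPU first, which also makes sync_module_states=True valid. A minimal sketch of that call, assuming the process group is already initialized and MyModel is a placeholder module defined elsewhere:

    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    # Assumes dist.init_process_group() has already run on every rank.
    rank = dist.get_rank()
    device = torch.device("cuda", rank % torch.cuda.device_count())

    model = MyModel()  # placeholder module, defined elsewhere
    # device_id tells FSDP to move the CPU module onto this GPU before sharding
    # init, which also satisfies the GPU-communication requirement of
    # sync_module_states=True.
    fsdp_model = FSDP(model, device_id=device, sync_module_states=True)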
2025-12-04T13:21:31.4723122Z ====================== 1 failed, 18 deselected in 10.45s ======================= 2025-12-04T13:21:31.4723159Z Got exit code 1 2025-12-04T13:21:31.4723198Z Retrying single test... 2025-12-04T13:21:31.4723388Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-50f384147ce25093.xml 2025-12-04T13:21:31.4723446Z ============================= test session starts ============================== 2025-12-04T13:21:31.4723558Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4723598Z cachedir: .pytest_cache 2025-12-04T13:21:31.4723756Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4723802Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4723842Z configfile: pytest.ini 2025-12-04T13:21:31.4724004Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4724079Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4724307Z stepcurrent: skipping 12 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4724351Z Running 1 items in this shard 2025-12-04T13:21:31.4724353Z 2025-12-04T13:21:31.4724661Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda I1204 13:17:58.220000 558820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 558889 2025-12-04T13:21:31.4724816Z I1204 13:17:58.221000 558820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 558890 2025-12-04T13:21:31.4724968Z I1204 13:17:58.221000 558820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 558891 2025-12-04T13:21:31.4725129Z I1204 13:17:58.222000 558820 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 558892 2025-12-04T13:21:31.4725710Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4725748Z _warn_cpu_init() 2025-12-04T13:21:31.4726316Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4726363Z _warn_cpu_init() 2025-12-04T13:21:31.4726655Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. 
You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4726716Z return func(*args, **kwargs) 2025-12-04T13:21:31.4727283Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4727330Z _warn_cpu_init() 2025-12-04T13:21:31.4727898Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4727937Z _warn_cpu_init() 2025-12-04T13:21:31.4728081Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4728280Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4728570Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4728725Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4729010Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4729137Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4729415Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4729576Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4729852Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4730001Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4730276Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4730414Z [rank1]:E1204 13:18:06.354000 558890 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4730692Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4730854Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4731347Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4731475Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4731672Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4732035Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4732151Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4732361Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4732528Z [rank1]:E1204 13:18:06.354000 558890 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4732568Z dist init r=1, world=4 2025-12-04T13:21:31.4732705Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4732867Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4733154Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4733309Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4733591Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4733728Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4734005Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4734153Z [rank2]:E1204 13:18:06.357000 558891 
site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4734429Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4734576Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4734851Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4734997Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4735287Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4735434Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4735913Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 24064 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4736042Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4736237Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4736598Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4736713Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4736928Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4737092Z [rank2]:E1204 13:18:06.357000 558891 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4737131Z dist init r=2, world=4 2025-12-04T13:21:31.4737270Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4737429Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4737715Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4737870Z [rank0]:E1204 
13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4738196Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4738320Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4738596Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4738743Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4739020Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4739168Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4739466Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4739614Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4739890Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4740051Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4740536Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4740650Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4740846Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4741205Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4741321Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4741532Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4741696Z [rank0]:E1204 13:18:06.359000 558889 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4741734Z dist init r=0, world=4 2025-12-04T13:21:31.4741872Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4742031Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4742329Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4742484Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4742767Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4742891Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4743167Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4743316Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4743606Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4743763Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4744037Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4744182Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4744459Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4744608Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4745085Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4745201Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4745394Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4745759Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4745873Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4746083Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4746246Z [rank3]:E1204 13:18:06.404000 558892 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4746285Z dist init r=3, world=4 2025-12-04T13:21:31.4746629Z [rank0]:[W1204 13:18:06.217449301 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4746671Z FAILED [10.0148s] [100%] 2025-12-04T13:21:31.4746673Z 2025-12-04T13:21:31.4746730Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4746830Z __ TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda __ 2025-12-04T13:21:31.4746877Z Traceback (most recent call last): 2025-12-04T13:21:31.4747037Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4747083Z self._join_processes(fn) 2025-12-04T13:21:31.4747254Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4747308Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4747485Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4747539Z raise RuntimeError(error) 2025-12-04T13:21:31.4747619Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4747675Z Traceback (most recent call last): 2025-12-04T13:21:31.4747835Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4747877Z getattr(self, test_name)() 2025-12-04T13:21:31.4748034Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4748079Z fn() 2025-12-04T13:21:31.4748267Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4748308Z method(*args, **kwargs) 2025-12-04T13:21:31.4748461Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4748501Z method(*args, **kwargs) 2025-12-04T13:21:31.4748652Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4748690Z with policy(): 2025-12-04T13:21:31.4748841Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4748882Z raise RuntimeError(msg) 2025-12-04T13:21:31.4749240Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 
2025-12-04T13:21:31.4749245Z 2025-12-04T13:21:31.4749319Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4749555Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4749557Z 2025-12-04T13:21:31.4749645Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4749648Z 2025-12-04T13:21:31.4749649Z 2025-12-04T13:21:31.4749724Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4749811Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4750043Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-50f384147ce25093.xml - 2025-12-04T13:21:31.4750103Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4750368Z FAILED [10.0148s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4750416Z Traceback (most recent call last): 2025-12-04T13:21:31.4750579Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4750622Z getattr(self, test_name)() 2025-12-04T13:21:31.4750780Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4750815Z fn() 2025-12-04T13:21:31.4750965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4751007Z method(*args, **kwargs) 2025-12-04T13:21:31.4751158Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4751197Z method(*args, **kwargs) 2025-12-04T13:21:31.4751348Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4751404Z with policy(): 2025-12-04T13:21:31.4751556Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4751616Z raise RuntimeError(msg) 2025-12-04T13:21:31.4751971Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda! Caching allocator allocated memory was 512 and is now reported as 19968 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4751993Z 2025-12-04T13:21:31.4752067Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4752302Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4752304Z 2025-12-04T13:21:31.4752391Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4752454Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
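Two further warnings repeat through every retry: barrier() reports that it is "using the device under current context", and ProcessGroupNCCL warns that destroy_process_group() was never called before exit. A hedged sketch of the setup and teardown both warnings point at is shown below; the nccl backend string and the reliance on launcher-provided RANK are assumptions for a single-node, multi-GPU run, not details taken from this job.

    import os
    import torch
    import torch.distributed as dist

    def main() -> None:
        rank = int(os.environ["RANK"])  # assumed to be set by the launcher
        device = torch.device("cuda", rank % torch.cuda.device_count())

        # Binding a device_id here addresses the barrier() warning, since
        # collectives then know which GPU the process group should use.
        dist.init_process_group(backend="nccl", device_id=device)
        try:
            dist.barrier()
            # ... run the distributed workload ...
        finally:
            # Explicit teardown avoids the "destroy_process_group() was not
            # called before program exit" warning and releases NCCL resources.
            dist.destroy_process_group()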
2025-12-04T13:21:31.4752517Z ====================== 1 failed, 18 deselected in 10.15s ======================= 2025-12-04T13:21:31.4752554Z Got exit code 1 2025-12-04T13:21:31.4752736Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda 2025-12-04T13:21:31.4752865Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4753054Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d75a93e73e18887b.xml 2025-12-04T13:21:31.4753112Z ============================= test session starts ============================== 2025-12-04T13:21:31.4753223Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4753266Z cachedir: .pytest_cache 2025-12-04T13:21:31.4753424Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4753473Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4753512Z configfile: pytest.ini 2025-12-04T13:21:31.4753675Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4753749Z collecting ... collected 60 items / 13 deselected / 47 selected 2025-12-04T13:21:31.4753803Z stepcurrent: skipping 13 already run items. 2025-12-04T13:21:31.4753846Z Running 6 items in this shard 2025-12-04T13:21:31.4753849Z 2025-12-04T13:21:31.4754177Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda I1204 13:18:10.698000 559222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 559291 2025-12-04T13:21:31.4754334Z I1204 13:18:10.699000 559222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 559292 2025-12-04T13:21:31.4754486Z I1204 13:18:10.700000 559222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 559293 2025-12-04T13:21:31.4754636Z I1204 13:18:10.700000 559222 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 559294 2025-12-04T13:21:31.4755215Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4755254Z _warn_cpu_init() 2025-12-04T13:21:31.4755835Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4755881Z _warn_cpu_init() 2025-12-04T13:21:31.4756456Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4756493Z _warn_cpu_init() 2025-12-04T13:21:31.4757056Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4757095Z _warn_cpu_init() 2025-12-04T13:21:31.4757385Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4757429Z return func(*args, **kwargs) 2025-12-04T13:21:31.4757572Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4757734Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4758023Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4758207Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4758517Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4758643Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4758921Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4759069Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4759346Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4759493Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4759782Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4759918Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4760207Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4760355Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4760859Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4760976Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4761171Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4761541Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4761657Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4761870Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4762035Z [rank2]:E1204 13:18:18.748000 559293 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4762074Z dist init r=2, world=4 2025-12-04T13:21:31.4762212Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4762370Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4762657Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4762822Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4763109Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4763233Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4763509Z [rank0]:E1204 13:18:18.767000 559291 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4763657Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4763933Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4764089Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4764373Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4764510Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4764795Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4764945Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4765435Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 16896 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4765551Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4765747Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4766114Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4766231Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4766442Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4766607Z [rank0]:E1204 13:18:18.767000 559291 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4766647Z dist init r=0, world=4 2025-12-04T13:21:31.4766784Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4766954Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4767240Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4767395Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4767679Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4767803Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4768127Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4768328Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4768604Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4768763Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4769037Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4769185Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4769463Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4769612Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4770097Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4770213Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4770409Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4770776Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4770893Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4771105Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4771269Z [rank3]:E1204 13:18:18.769000 559294 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4771321Z dist init r=3, world=4 2025-12-04T13:21:31.4771458Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4771617Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4771904Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4772057Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4772343Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4772467Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4772752Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4772911Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.4773188Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4773354Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4773631Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4773767Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4774044Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4774193Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4774677Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 23040 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4774793Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4774988Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4775352Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4775469Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4775688Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4775853Z [rank1]:E1204 13:18:18.833000 559292 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4775892Z dist init r=1, world=4 2025-12-04T13:21:31.4776229Z [rank0]:[W1204 13:18:19.727574385 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4776270Z FAILED [10.0155s] [ 16%] 2025-12-04T13:21:31.4776273Z 2025-12-04T13:21:31.4776328Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4776438Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.4776485Z Traceback (most recent call last): 2025-12-04T13:21:31.4776647Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4776699Z self._join_processes(fn) 2025-12-04T13:21:31.4776871Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4776935Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4777113Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4777156Z raise RuntimeError(error) 2025-12-04T13:21:31.4777249Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4777294Z Traceback (most recent call last): 2025-12-04T13:21:31.4777456Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4777498Z getattr(self, test_name)() 2025-12-04T13:21:31.4777659Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4777693Z fn() 2025-12-04T13:21:31.4777845Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4777885Z method(*args, **kwargs) 2025-12-04T13:21:31.4778038Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4778076Z method(*args, **kwargs) 2025-12-04T13:21:31.4778274Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4778311Z with policy(): 2025-12-04T13:21:31.4778463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4778505Z raise RuntimeError(msg) 2025-12-04T13:21:31.4778868Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 
2025-12-04T13:21:31.4778872Z 2025-12-04T13:21:31.4778947Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4779191Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4779194Z 2025-12-04T13:21:31.4779283Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4779285Z 2025-12-04T13:21:31.4779287Z 2025-12-04T13:21:31.4779377Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4779466Z Process 2 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4779697Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-d75a93e73e18887b.xml - 2025-12-04T13:21:31.4779759Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4780016Z FAILED [10.0155s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.4780062Z Traceback (most recent call last): 2025-12-04T13:21:31.4780226Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4780268Z getattr(self, test_name)() 2025-12-04T13:21:31.4780427Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4780461Z fn() 2025-12-04T13:21:31.4780624Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4780675Z method(*args, **kwargs) 2025-12-04T13:21:31.4780826Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4780865Z method(*args, **kwargs) 2025-12-04T13:21:31.4781015Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4781066Z with policy(): 2025-12-04T13:21:31.4781218Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4781258Z raise RuntimeError(msg) 2025-12-04T13:21:31.4781626Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4781629Z 2025-12-04T13:21:31.4781705Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4781944Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4781946Z 2025-12-04T13:21:31.4782036Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4782100Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
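The repeated UserWarning from torch/distributed/fsdp/_init_utils.py above recommends passing device_id so FSDP moves the CPU-resident module to the GPU before sharding initialization. A minimal sketch of that recommendation follows; it is not taken from test_fsdp_core.py itself (the offload_true variants deliberately start the module on CPU), and the model and rank handling here are placeholders.

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_model(local_rank: int) -> FSDP:
    model = nn.Linear(1024, 1024)  # module starts on CPU, as in the warning above
    # Passing device_id lets FSDP move the module to the local GPU so sharding
    # initialization (and sync_module_states=True) run on-device instead of on CPU.
    return FSDP(
        model,
        device_id=torch.device("cuda", local_rank),
        sync_module_states=True,
    )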
2025-12-04T13:21:31.4782163Z ====================== 1 failed, 13 deselected in 10.15s ======================= 2025-12-04T13:21:31.4782200Z Got exit code 1 2025-12-04T13:21:31.4782243Z Retrying single test... 2025-12-04T13:21:31.4782432Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-35564a50697736ba.xml 2025-12-04T13:21:31.4782490Z ============================= test session starts ============================== 2025-12-04T13:21:31.4782602Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4782644Z cachedir: .pytest_cache 2025-12-04T13:21:31.4782801Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4782848Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4782888Z configfile: pytest.ini 2025-12-04T13:21:31.4783050Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4783124Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4783370Z stepcurrent: skipping 13 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4783415Z Running 1 items in this shard 2025-12-04T13:21:31.4783418Z 2025-12-04T13:21:31.4783734Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda I1204 13:18:23.231000 559624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 559693 2025-12-04T13:21:31.4783889Z I1204 13:18:23.232000 559624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 559694 2025-12-04T13:21:31.4784041Z I1204 13:18:23.233000 559624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 559695 2025-12-04T13:21:31.4784192Z I1204 13:18:23.233000 559624 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 559696 2025-12-04T13:21:31.4784781Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4784828Z _warn_cpu_init() 2025-12-04T13:21:31.4785397Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4785538Z _warn_cpu_init() 2025-12-04T13:21:31.4788024Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4788067Z _warn_cpu_init() 2025-12-04T13:21:31.4788698Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4788735Z _warn_cpu_init() 2025-12-04T13:21:31.4789030Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4789074Z return func(*args, **kwargs) 2025-12-04T13:21:31.4789219Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4789381Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4789709Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4789867Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4790152Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4790278Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4790556Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4790706Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4790998Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4791159Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4791436Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4791573Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4791865Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4792013Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4792504Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4792621Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4792818Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4793189Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4793304Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4793518Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4793683Z [rank1]:E1204 13:18:31.190000 559694 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4793723Z dist init r=1, world=4 2025-12-04T13:21:31.4793861Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4794030Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4794318Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4794472Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4794757Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4794881Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4795158Z [rank0]:E1204 13:18:31.203000 559693 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4795315Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4795601Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4795749Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4796036Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4796174Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4796451Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4796600Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4797086Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4797203Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4797399Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4797770Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4797885Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4798097Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4798314Z [rank0]:E1204 13:18:31.203000 559693 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4798353Z dist init r=0, world=4 2025-12-04T13:21:31.4798491Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4798652Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4798939Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4799093Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4799379Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4799515Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4799790Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4799953Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4800228Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4800388Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4800663Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4800800Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4801079Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4801228Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4801714Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4801828Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4802023Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4802391Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4802514Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4802728Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4802892Z [rank3]:E1204 13:18:31.209000 559696 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4802932Z dist init r=3, world=4 2025-12-04T13:21:31.4803068Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4803227Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4803514Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4803668Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4803965Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4804102Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4804379Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4804550Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.4804829Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4804975Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4805251Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4805386Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4805665Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4805814Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4806301Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4806417Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4806612Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4807003Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4807118Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4807329Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4807493Z [rank2]:E1204 13:18:31.260000 559695 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4807531Z dist init r=2, world=4 2025-12-04T13:21:31.4807869Z [rank0]:[W1204 13:18:31.085636797 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4807908Z FAILED [9.8154s] [100%] 2025-12-04T13:21:31.4807912Z 2025-12-04T13:21:31.4807981Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4808100Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.4808174Z Traceback (most recent call last): 2025-12-04T13:21:31.4808339Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4808383Z self._join_processes(fn) 2025-12-04T13:21:31.4808554Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4808626Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4808804Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4808848Z raise RuntimeError(error) 2025-12-04T13:21:31.4808929Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4808974Z Traceback (most recent call last): 2025-12-04T13:21:31.4809136Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4809178Z getattr(self, test_name)() 2025-12-04T13:21:31.4809335Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4809369Z fn() 2025-12-04T13:21:31.4809523Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4809563Z method(*args, **kwargs) 2025-12-04T13:21:31.4809714Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4809753Z method(*args, **kwargs) 2025-12-04T13:21:31.4809903Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4809940Z with policy(): 2025-12-04T13:21:31.4810093Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4810132Z raise RuntimeError(msg) 2025-12-04T13:21:31.4810494Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4810498Z 2025-12-04T13:21:31.4810573Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4810829Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4810832Z 2025-12-04T13:21:31.4810921Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4810924Z 2025-12-04T13:21:31.4810926Z 2025-12-04T13:21:31.4811002Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4811089Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4811320Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-35564a50697736ba.xml - 2025-12-04T13:21:31.4811381Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4811638Z FAILED [9.8154s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4811684Z Traceback (most recent call last): 2025-12-04T13:21:31.4811862Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4811917Z getattr(self, test_name)() 2025-12-04T13:21:31.4812076Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4812111Z fn() 2025-12-04T13:21:31.4812262Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4812312Z method(*args, **kwargs) 2025-12-04T13:21:31.4812463Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4812502Z method(*args, **kwargs) 2025-12-04T13:21:31.4812653Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4812690Z with policy(): 2025-12-04T13:21:31.4812844Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4812885Z raise RuntimeError(msg) 2025-12-04T13:21:31.4813244Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 20992 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:21:31.4813247Z 2025-12-04T13:21:31.4813321Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4813562Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4813565Z 2025-12-04T13:21:31.4813651Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4813716Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
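Two other warnings recur in the run above: c10d_logger.py suggests passing device_id to init_process_group to silence the barrier() device-guessing warning, and ProcessGroupNCCL warns that destroy_process_group() was not called before exit. A minimal setup/teardown sketch addressing both follows; it assumes a recent PyTorch where init_process_group accepts device_id and that the launcher has set the usual rendezvous environment variables.

import os
import torch
import torch.distributed as dist

def init_and_teardown() -> None:
    # Assumes MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE, LOCAL_RANK are set by the launcher.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)
    # Binding the process group to a device avoids the barrier() warning above.
    dist.init_process_group(
        backend="nccl",  # maps to RCCL on ROCm
        device_id=torch.device("cuda", local_rank),
    )
    try:
        dist.barrier()
        # ... test body ...
    finally:
        # Explicit teardown avoids the ProcessGroupNCCL resource-leak warning.
        dist.destroy_process_group()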
2025-12-04T13:21:31.4813778Z ======================= 1 failed, 18 deselected in 9.95s ======================= 2025-12-04T13:21:31.4813816Z Got exit code 1 2025-12-04T13:21:31.4813855Z Retrying single test... 2025-12-04T13:21:31.4814043Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-168828f8a7ed70a3.xml 2025-12-04T13:21:31.4814101Z ============================= test session starts ============================== 2025-12-04T13:21:31.4814215Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4814256Z cachedir: .pytest_cache 2025-12-04T13:21:31.4814427Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4814474Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4814513Z configfile: pytest.ini 2025-12-04T13:21:31.4814679Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4814753Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4814989Z stepcurrent: skipping 13 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4815032Z Running 1 items in this shard 2025-12-04T13:21:31.4815035Z 2025-12-04T13:21:31.4815352Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda I1204 13:18:35.655000 560026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 560095 2025-12-04T13:21:31.4815508Z I1204 13:18:35.656000 560026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 560096 2025-12-04T13:21:31.4815673Z I1204 13:18:35.657000 560026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 560097 2025-12-04T13:21:31.4815835Z I1204 13:18:35.657000 560026 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 560098 2025-12-04T13:21:31.4816414Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4816461Z _warn_cpu_init() 2025-12-04T13:21:31.4817031Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4817070Z _warn_cpu_init() 2025-12-04T13:21:31.4817634Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4817673Z _warn_cpu_init() 2025-12-04T13:21:31.4818276Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4818313Z _warn_cpu_init() 2025-12-04T13:21:31.4818604Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4818647Z return func(*args, **kwargs) 2025-12-04T13:21:31.4818805Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4818968Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4819257Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4819413Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4819700Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4819827Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4820116Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4820286Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4820562Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4820722Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4820998Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4821135Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4821412Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4821562Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4822053Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 2025-12-04T13:21:31.4822168Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4822365Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4822735Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4822851Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4823062Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4823237Z [rank0]:E1204 13:18:43.612000 560095 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4823277Z dist init r=0, world=4 2025-12-04T13:21:31.4823414Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4823574Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4823860Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4824015Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4824319Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4824444Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4824731Z [rank3]:E1204 13:18:43.616000 560098 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4824878Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4825165Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4825311Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4825588Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4825725Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4826002Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4826152Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4826642Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 
2025-12-04T13:21:31.4826758Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4826952Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4827321Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4827447Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4827658Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4827823Z [rank3]:E1204 13:18:43.616000 560098 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4827860Z dist init r=3, world=4 2025-12-04T13:21:31.4827998Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4828201Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4828488Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4828656Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4828953Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4829077Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4829356Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4829520Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4829796Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4829944Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4830219Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4830357Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.4830634Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4830781Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4831266Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 1. CUDA driver allocated memory was 2317352960 and is now 3827302400. 2025-12-04T13:21:31.4831381Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4831576Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4831960Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4832075Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4832286Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4832448Z [rank1]:E1204 13:18:43.620000 560096 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4832487Z dist init r=1, world=4 2025-12-04T13:21:31.4832626Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4832786Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4833083Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4833247Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4833531Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4833668Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4833945Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4834092Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] 
method(*args, **kwargs) 2025-12-04T13:21:31.4834368Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4834515Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4834793Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4834929Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4835206Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4835354Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4835836Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 12800 on device 2. CUDA driver allocated memory was 2300575744 and is now 3810525184. 2025-12-04T13:21:31.4835962Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4836159Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4836526Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4836639Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4836850Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4837015Z [rank2]:E1204 13:18:43.672000 560097 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4837053Z dist init r=2, world=4 2025-12-04T13:21:31.4837409Z [rank0]:[W1204 13:18:43.461412588 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4837458Z FAILED [9.8141s] [100%] 2025-12-04T13:21:31.4837460Z 2025-12-04T13:21:31.4837517Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4837637Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda _ 2025-12-04T13:21:31.4837684Z Traceback (most recent call last): 2025-12-04T13:21:31.4837846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4837890Z self._join_processes(fn) 2025-12-04T13:21:31.4838063Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4838117Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4838340Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4838382Z raise RuntimeError(error) 2025-12-04T13:21:31.4838463Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4838507Z Traceback (most recent call last): 2025-12-04T13:21:31.4838669Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4838710Z getattr(self, test_name)() 2025-12-04T13:21:31.4838869Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4838902Z fn() 2025-12-04T13:21:31.4839055Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4839095Z method(*args, **kwargs) 2025-12-04T13:21:31.4839247Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4839286Z method(*args, **kwargs) 2025-12-04T13:21:31.4839436Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4839472Z with policy(): 2025-12-04T13:21:31.4839625Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4839664Z raise RuntimeError(msg) 2025-12-04T13:21:31.4840043Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4840046Z 2025-12-04T13:21:31.4840121Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4840365Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4840368Z 2025-12-04T13:21:31.4840457Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4840460Z 2025-12-04T13:21:31.4840521Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4840567Z Traceback (most recent call last): 2025-12-04T13:21:31.4840729Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4840772Z getattr(self, test_name)() 2025-12-04T13:21:31.4840944Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4840979Z fn() 2025-12-04T13:21:31.4841142Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4841181Z method(*args, **kwargs) 2025-12-04T13:21:31.4841331Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4841372Z method(*args, **kwargs) 2025-12-04T13:21:31.4841536Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4841572Z with policy(): 2025-12-04T13:21:31.4841723Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4841765Z raise RuntimeError(msg) 2025-12-04T13:21:31.4842123Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4842128Z 2025-12-04T13:21:31.4842201Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4842442Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4842445Z 2025-12-04T13:21:31.4842532Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4842534Z 2025-12-04T13:21:31.4842536Z 2025-12-04T13:21:31.4842613Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4842700Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.4842933Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-168828f8a7ed70a3.xml - 2025-12-04T13:21:31.4842995Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4843251Z FAILED [9.8141s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4843296Z Traceback (most recent call last): 2025-12-04T13:21:31.4843461Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4843502Z getattr(self, test_name)() 2025-12-04T13:21:31.4843671Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4843705Z fn() 2025-12-04T13:21:31.4843857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4843898Z method(*args, **kwargs) 2025-12-04T13:21:31.4844049Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4844088Z method(*args, **kwargs) 2025-12-04T13:21:31.4844238Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4844275Z with policy(): 2025-12-04T13:21:31.4844425Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4844465Z raise RuntimeError(msg) 2025-12-04T13:21:31.4844835Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 0. CUDA driver allocated memory was 2453667840 and is now 3963617280. 
2025-12-04T13:21:31.4844839Z 2025-12-04T13:21:31.4844921Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4845163Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4845165Z 2025-12-04T13:21:31.4845252Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4845268Z 2025-12-04T13:21:31.4845327Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4845371Z Traceback (most recent call last): 2025-12-04T13:21:31.4845534Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4845575Z getattr(self, test_name)() 2025-12-04T13:21:31.4845734Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4845768Z fn() 2025-12-04T13:21:31.4845919Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4845959Z method(*args, **kwargs) 2025-12-04T13:21:31.4846108Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4846146Z method(*args, **kwargs) 2025-12-04T13:21:31.4846296Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4846332Z with policy(): 2025-12-04T13:21:31.4846484Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4846524Z raise RuntimeError(msg) 2025-12-04T13:21:31.4846883Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 25088 on device 3. CUDA driver allocated memory was 2250244096 and is now 3760193536. 2025-12-04T13:21:31.4846886Z 2025-12-04T13:21:31.4846959Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4847201Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4847205Z 2025-12-04T13:21:31.4847292Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4847356Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
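Each run above also repeats the _warn_cpu_init() UserWarning: the module handed to FSDP still lives on CPU, so sharding initialization runs on CPU and sync_module_states=True cannot use GPU collectives. The warning's own suggestion is to pass device_id. The snippet below is a minimal sketch of that call pattern, assuming a process group is already initialized (for example under torchrun); MyModel is a placeholder, not the model used by this test.

```python
# Minimal sketch: construct FSDP with device_id so the CPU-resident module is
# moved to the local GPU before sharding, which avoids the _warn_cpu_init()
# warning and satisfies the requirement of sync_module_states=True.
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP


class MyModel(nn.Module):  # placeholder model
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 1))

    def forward(self, x):
        return self.net(x)


def wrap_model() -> FSDP:
    return FSDP(
        MyModel(),
        device_id=torch.cuda.current_device(),
        sync_module_states=True,
    )
```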
2025-12-04T13:21:31.4847429Z ======================= 1 failed, 18 deselected in 9.95s ======================= 2025-12-04T13:21:31.4847466Z Got exit code 1 2025-12-04T13:21:31.4847657Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda 2025-12-04T13:21:31.4847787Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4847976Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a4b1e12efc4a33d8.xml 2025-12-04T13:21:31.4848035Z ============================= test session starts ============================== 2025-12-04T13:21:31.4848190Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4848231Z cachedir: .pytest_cache 2025-12-04T13:21:31.4848389Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4848436Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4848476Z configfile: pytest.ini 2025-12-04T13:21:31.4848652Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4848738Z collecting ... collected 60 items / 14 deselected / 46 selected 2025-12-04T13:21:31.4848791Z stepcurrent: skipping 14 already run items. 2025-12-04T13:21:31.4848835Z Running 5 items in this shard 2025-12-04T13:21:31.4848837Z 2025-12-04T13:21:31.4849193Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda I1204 13:18:48.110000 560428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 560497 2025-12-04T13:21:31.4849361Z I1204 13:18:48.110000 560428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 560498 2025-12-04T13:21:31.4849513Z I1204 13:18:48.111000 560428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 560499 2025-12-04T13:21:31.4849666Z I1204 13:18:48.112000 560428 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 560500 2025-12-04T13:21:31.4849960Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4850013Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4850591Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4850630Z _warn_cpu_init() 2025-12-04T13:21:31.4850920Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4851000Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4851285Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4851337Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4851930Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4851968Z _warn_cpu_init() 2025-12-04T13:21:31.4852255Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4852331Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4852618Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4852666Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4853246Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4853292Z _warn_cpu_init() 2025-12-04T13:21:31.4853577Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4853662Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4853946Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4853996Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4854566Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4854604Z _warn_cpu_init() 2025-12-04T13:21:31.4854890Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4854963Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4855193Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4855235Z return func(*args, **kwargs) 2025-12-04T13:21:31.4855459Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4855500Z return func(*args, **kwargs) 2025-12-04T13:21:31.4855722Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4855763Z return func(*args, **kwargs) 2025-12-04T13:21:31.4855994Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4856034Z return func(*args, **kwargs) 2025-12-04T13:21:31.4856254Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4856296Z return func(*args, **kwargs) 2025-12-04T13:21:31.4856514Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4856554Z return func(*args, **kwargs) 2025-12-04T13:21:31.4856775Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4856815Z return func(*args, **kwargs) 2025-12-04T13:21:31.4857034Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4857085Z return func(*args, **kwargs) 2025-12-04T13:21:31.4857377Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:21:31.4857429Z return func(*args, **kwargs) 2025-12-04T13:21:31.4857573Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4857748Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4858038Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4858230Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4858517Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4858642Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4858921Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4859073Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4859351Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4859498Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4859775Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4859913Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4860204Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4860353Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4860885Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 
2025-12-04T13:21:31.4861002Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4861199Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4861626Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4861757Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4861967Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4862144Z [rank3]:E1204 13:18:54.044000 560500 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4862182Z dist init r=3, world=4 2025-12-04T13:21:31.4862321Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4862482Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4862770Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4862924Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4863208Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4863334Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4863610Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4863759Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4864035Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4864185Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4864474Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4864613Z [rank0]:E1204 13:18:54.045000 560497 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4864891Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4865038Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4865567Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 2025-12-04T13:21:31.4865692Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4865888Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4866308Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4866438Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4866653Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4866819Z [rank0]:E1204 13:18:54.045000 560497 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4866957Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4867115Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4867402Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4867556Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4867842Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4867965Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4868274Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4868422Z [rank1]:E1204 
13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4868709Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4868857Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4869134Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4869272Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4869549Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4869697Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4870238Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 162304 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 
2025-12-04T13:21:31.4870363Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4870558Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4870976Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4871091Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4871302Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4871467Z [rank1]:E1204 13:18:54.045000 560498 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4871507Z dist init r=0, world=4 2025-12-04T13:21:31.4871544Z dist init r=1, world=4 2025-12-04T13:21:31.4871682Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4871842Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4872132Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4872285Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4872569Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4872695Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4872982Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4873130Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4873405Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4873554Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4873827Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4873964Z [rank2]:E1204 13:18:54.093000 560499 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4874253Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4874410Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4874934Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3466592256. 2025-12-04T13:21:31.4875058Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4875255Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4875663Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4875777Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4875988Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4876154Z [rank2]:E1204 13:18:54.093000 560499 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4876193Z dist init r=2, world=4 2025-12-04T13:21:31.4876528Z [rank0]:[W1204 13:18:54.959531404 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4876569Z FAILED [7.7129s] [ 20%] 2025-12-04T13:21:31.4876571Z 2025-12-04T13:21:31.4876627Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4876773Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda _ 2025-12-04T13:21:31.4876820Z Traceback (most recent call last): 2025-12-04T13:21:31.4876985Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4877027Z self._join_processes(fn) 2025-12-04T13:21:31.4877209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4877265Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4877441Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4877486Z raise RuntimeError(error) 2025-12-04T13:21:31.4877566Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4877611Z Traceback (most recent call last): 2025-12-04T13:21:31.4877771Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4877816Z getattr(self, test_name)() 2025-12-04T13:21:31.4877973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4878007Z fn() 2025-12-04T13:21:31.4878195Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4878250Z method(*args, **kwargs) 2025-12-04T13:21:31.4878401Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4878454Z method(*args, **kwargs) 2025-12-04T13:21:31.4878603Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4878640Z with policy(): 2025-12-04T13:21:31.4878791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4878845Z raise RuntimeError(msg) 2025-12-04T13:21:31.4879249Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4879252Z 2025-12-04T13:21:31.4879328Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4879611Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4879613Z 2025-12-04T13:21:31.4879702Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4879705Z 2025-12-04T13:21:31.4879764Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4879809Z Traceback (most recent call last): 2025-12-04T13:21:31.4879973Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4880016Z getattr(self, test_name)() 2025-12-04T13:21:31.4880176Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4880209Z fn() 2025-12-04T13:21:31.4880361Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4880401Z method(*args, **kwargs) 2025-12-04T13:21:31.4880552Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4880590Z method(*args, **kwargs) 2025-12-04T13:21:31.4880740Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4880777Z with policy(): 2025-12-04T13:21:31.4880929Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4880981Z raise RuntimeError(msg) 2025-12-04T13:21:31.4881385Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 162304 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 2025-12-04T13:21:31.4881388Z 2025-12-04T13:21:31.4881463Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4881742Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4881745Z 2025-12-04T13:21:31.4881832Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4881834Z 2025-12-04T13:21:31.4881836Z 2025-12-04T13:21:31.4881912Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4882000Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.4882246Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a4b1e12efc4a33d8.xml - 2025-12-04T13:21:31.4882324Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4882621Z FAILED [7.7129s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4882678Z Traceback (most recent call last): 2025-12-04T13:21:31.4882841Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4882882Z getattr(self, test_name)() 2025-12-04T13:21:31.4883043Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4883077Z fn() 2025-12-04T13:21:31.4883229Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4883269Z method(*args, **kwargs) 2025-12-04T13:21:31.4883419Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4883457Z method(*args, **kwargs) 2025-12-04T13:21:31.4883607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4883644Z with policy(): 2025-12-04T13:21:31.4883796Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4883835Z raise RuntimeError(msg) 2025-12-04T13:21:31.4884238Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4884241Z 2025-12-04T13:21:31.4884314Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4884594Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4884597Z 2025-12-04T13:21:31.4884684Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4884687Z 2025-12-04T13:21:31.4884745Z Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.4884790Z Traceback (most recent call last): 2025-12-04T13:21:31.4886156Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4886201Z getattr(self, test_name)() 2025-12-04T13:21:31.4886362Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4886397Z fn() 2025-12-04T13:21:31.4886547Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4886586Z method(*args, **kwargs) 2025-12-04T13:21:31.4886735Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4886775Z method(*args, **kwargs) 2025-12-04T13:21:31.4886923Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4886959Z with policy(): 2025-12-04T13:21:31.4887111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4887166Z raise RuntimeError(msg) 2025-12-04T13:21:31.4887566Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 162304 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 2025-12-04T13:21:31.4887583Z 2025-12-04T13:21:31.4887655Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4887942Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4887945Z 2025-12-04T13:21:31.4888033Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4888097Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4888198Z ======================= 1 failed, 14 deselected in 7.88s ======================= 2025-12-04T13:21:31.4888236Z Got exit code 1 2025-12-04T13:21:31.4888276Z Retrying single test... 
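For context on the failure being retried above and below: with PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 the test wrapper snapshots per-device memory counters before the test body runs and compares them again once it finishes, and any device whose usage has not returned to its baseline is reported as the "CUDA driver API confirmed a leak in ..." RuntimeError, after which the worker exits with code 10. The sketch below is only a rough illustration of that before/after comparison, written against public torch.cuda calls (memory_allocated, mem_get_info, empty_cache); the check_cuda_leak helper and the demo at the bottom are invented for this example and are not the actual code path in common_utils.py.

import torch

def check_cuda_leak(test_fn, device=0):
    # Hypothetical helper: record caching-allocator and driver-level usage,
    # run the test body, then re-measure and complain if usage grew.
    torch.cuda.synchronize(device)
    alloc_before = torch.cuda.memory_allocated(device)      # bytes held by the caching allocator
    free_before, total = torch.cuda.mem_get_info(device)    # driver-level (free, total) bytes
    driver_before = total - free_before

    test_fn()

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()                                 # release cached-but-unused blocks before re-measuring
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible leak on device {device}: caching allocator "
            f"{alloc_before} -> {alloc_after} bytes, driver {driver_before} -> {driver_after} bytes"
        )

# Demo: keeping a reference to a tensor allocated inside the test body
# means its memory cannot be returned, so the check fires.
leaked = []
try:
    check_cuda_leak(lambda: leaked.append(torch.ones(1 << 20, device="cuda:0")))
except RuntimeError as err:
    print(err)

The separate ProcessGroupNCCL warning repeated in this log ("destroy_process_group() was not called before program exit") is independent of the leak checker; it only notes that the workers exited without calling torch.distributed.destroy_process_group(), which can itself leak resources.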
2025-12-04T13:21:31.4888465Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6d3b6c109b160d41.xml 2025-12-04T13:21:31.4888523Z ============================= test session starts ============================== 2025-12-04T13:21:31.4888636Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4888677Z cachedir: .pytest_cache 2025-12-04T13:21:31.4888836Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4888882Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4888922Z configfile: pytest.ini 2025-12-04T13:21:31.4889086Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4889161Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4889437Z stepcurrent: skipping 14 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4889480Z Running 1 items in this shard 2025-12-04T13:21:31.4889482Z 2025-12-04T13:21:31.4889837Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda I1204 13:18:58.464000 560830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 560899 2025-12-04T13:21:31.4890013Z I1204 13:18:58.465000 560830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 560900 2025-12-04T13:21:31.4890166Z I1204 13:18:58.466000 560830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 560901 2025-12-04T13:21:31.4890315Z I1204 13:18:58.467000 560830 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 560902 2025-12-04T13:21:31.4890607Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4890659Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4891255Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4891293Z _warn_cpu_init() 2025-12-04T13:21:31.4891593Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4891642Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4892212Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4892264Z _warn_cpu_init() 2025-12-04T13:21:31.4892554Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4892631Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4892917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4892992Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4893277Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4893326Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4893897Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4893935Z _warn_cpu_init() 2025-12-04T13:21:31.4894221Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4894305Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4894588Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4894638Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4895205Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4895242Z _warn_cpu_init() 2025-12-04T13:21:31.4895530Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4895615Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4895844Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4895897Z return func(*args, **kwargs) 2025-12-04T13:21:31.4896120Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4896172Z return func(*args, **kwargs) 2025-12-04T13:21:31.4896394Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4896434Z return func(*args, **kwargs) 2025-12-04T13:21:31.4896657Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4896696Z return func(*args, **kwargs) 2025-12-04T13:21:31.4896917Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4896956Z return func(*args, **kwargs) 2025-12-04T13:21:31.4897175Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4897215Z return func(*args, **kwargs) 2025-12-04T13:21:31.4897434Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4897476Z return func(*args, **kwargs) 2025-12-04T13:21:31.4897697Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4897738Z return func(*args, **kwargs) 2025-12-04T13:21:31.4898029Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T13:21:31.4898070Z return func(*args, **kwargs) 2025-12-04T13:21:31.4898268Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4898431Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4898736Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4898892Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4899177Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4899302Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4899582Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4899730Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4900081Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4900242Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4900520Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4900674Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4900953Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4901102Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4901632Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4901749Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4901945Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4902356Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4902474Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4902684Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4902850Z [rank0]:E1204 13:19:04.417000 560899 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4902899Z dist init r=0, world=4 2025-12-04T13:21:31.4903039Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4903199Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4903486Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4903639Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4903924Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4904049Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4904335Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4904493Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4904767Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4904924Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4905198Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4905337Z [rank1]:E1204 13:19:04.428000 560900 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4905620Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4905769Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4906298Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 166400 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 2025-12-04T13:21:31.4906413Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4906610Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4907017Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4907135Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4907359Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4907524Z [rank1]:E1204 13:19:04.428000 560900 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4907563Z dist init r=1, world=4 2025-12-04T13:21:31.4907700Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4907861Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4908195Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4908351Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4908649Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4908785Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4909061Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T13:21:31.4909221Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4909497Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4909644Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4909919Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4910054Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4910332Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4910484Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4911011Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 147968 on device 2. CUDA driver allocated memory was 2300575744 and is now 3466592256. 
2025-12-04T13:21:31.4911127Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4911322Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4911742Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4911860Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4912072Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4912236Z [rank2]:E1204 13:19:04.494000 560901 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4912275Z dist init r=2, world=4 2025-12-04T13:21:31.4912412Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4912571Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4912868Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4913030Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4913318Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4913452Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4913729Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4913877Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4914155Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4914302Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4914576Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4914715Z [rank3]:E1204 13:19:04.507000 560902 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4914992Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4915141Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4915667Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 152064 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T13:21:31.4915801Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4915998Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4916405Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4916522Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4916735Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4916899Z [rank3]:E1204 13:19:04.507000 560902 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4916939Z dist init r=3, world=4 2025-12-04T13:21:31.4917284Z [rank0]:[W1204 13:19:04.272107138 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4917333Z FAILED [7.5126s] [100%] 2025-12-04T13:21:31.4917335Z 2025-12-04T13:21:31.4917392Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4917537Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda _ 2025-12-04T13:21:31.4917594Z Traceback (most recent call last): 2025-12-04T13:21:31.4917758Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4917801Z self._join_processes(fn) 2025-12-04T13:21:31.4917975Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4918028Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4918259Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4918303Z raise RuntimeError(error) 2025-12-04T13:21:31.4918384Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4918429Z Traceback (most recent call last): 2025-12-04T13:21:31.4918591Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4918633Z getattr(self, test_name)() 2025-12-04T13:21:31.4918791Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4918826Z fn() 2025-12-04T13:21:31.4918978Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4919019Z method(*args, **kwargs) 2025-12-04T13:21:31.4919171Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4919211Z method(*args, **kwargs) 2025-12-04T13:21:31.4919361Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4919398Z with policy(): 2025-12-04T13:21:31.4919550Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4919591Z raise RuntimeError(msg) 2025-12-04T13:21:31.4920007Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4920010Z 2025-12-04T13:21:31.4920087Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4920368Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4920371Z 2025-12-04T13:21:31.4920459Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4920462Z 2025-12-04T13:21:31.4920464Z 2025-12-04T13:21:31.4920540Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4920627Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4920874Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-6d3b6c109b160d41.xml - 2025-12-04T13:21:31.4920935Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4921243Z FAILED [7.5126s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4921289Z Traceback (most recent call last): 2025-12-04T13:21:31.4921453Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4921695Z getattr(self, test_name)() 2025-12-04T13:21:31.4921857Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4921891Z fn() 2025-12-04T13:21:31.4922043Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4922084Z method(*args, **kwargs) 2025-12-04T13:21:31.4922234Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4922274Z method(*args, **kwargs) 2025-12-04T13:21:31.4922423Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4922461Z with policy(): 2025-12-04T13:21:31.4922611Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4922652Z raise RuntimeError(msg) 2025-12-04T13:21:31.4923054Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 158208 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 2025-12-04T13:21:31.4923058Z 2025-12-04T13:21:31.4923132Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4923413Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4923415Z 2025-12-04T13:21:31.4923503Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4923566Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4923628Z ======================= 1 failed, 18 deselected in 7.65s ======================= 2025-12-04T13:21:31.4923665Z Got exit code 1 2025-12-04T13:21:31.4923705Z Retrying single test... 2025-12-04T13:21:31.4923904Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7de742e840eb07e8.xml 2025-12-04T13:21:31.4923962Z ============================= test session starts ============================== 2025-12-04T13:21:31.4924076Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4924116Z cachedir: .pytest_cache 2025-12-04T13:21:31.4924274Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4924319Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4924359Z configfile: pytest.ini 2025-12-04T13:21:31.4924523Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4924598Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4924883Z stepcurrent: skipping 14 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4924928Z Running 1 items in this shard 2025-12-04T13:21:31.4924940Z 2025-12-04T13:21:31.4925293Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda I1204 13:19:08.677000 561232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 561301 2025-12-04T13:21:31.4925449Z I1204 13:19:08.677000 561232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 561302 2025-12-04T13:21:31.4925611Z I1204 13:19:08.678000 561232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 561303 2025-12-04T13:21:31.4925762Z I1204 13:19:08.679000 561232 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 561304 2025-12-04T13:21:31.4926055Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4926106Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4926687Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4926725Z _warn_cpu_init() 2025-12-04T13:21:31.4927016Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 
2025-12-04T13:21:31.4927065Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4927637Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4927677Z _warn_cpu_init() 2025-12-04T13:21:31.4927964Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4928051Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4928376Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4928453Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4928737Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4928787Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4929379Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4929428Z _warn_cpu_init() 2025-12-04T13:21:31.4929714Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4929788Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4930071Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:426: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4930136Z return FSDP(layer, group, **fsdp_kwargs) 2025-12-04T13:21:31.4930706Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4930744Z _warn_cpu_init() 2025-12-04T13:21:31.4931027Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_fsdp.py:1464: FutureWarning: The `NO_SHARD` sharding strategy is deprecated. If having issues, please use `DistributedDataParallel` instead. 2025-12-04T13:21:31.4931102Z fsdp_model = FSDP(fsdp_model, self.process_group, **fsdp_kwargs) 2025-12-04T13:21:31.4931331Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4931374Z return func(*args, **kwargs) 2025-12-04T13:21:31.4931599Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4931643Z return func(*args, **kwargs) 2025-12-04T13:21:31.4931866Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4931907Z return func(*args, **kwargs) 2025-12-04T13:21:31.4932128Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4932170Z return func(*args, **kwargs) 2025-12-04T13:21:31.4932403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4932443Z return func(*args, **kwargs) 2025-12-04T13:21:31.4932664Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4932704Z return func(*args, **kwargs) 2025-12-04T13:21:31.4932924Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4932963Z return func(*args, **kwargs) 2025-12-04T13:21:31.4933183Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py:124: UserWarning: When using ``NO_SHARD`` for ``ShardingStrategy``, full_state_dict will be returned. 2025-12-04T13:21:31.4933223Z return func(*args, **kwargs) 2025-12-04T13:21:31.4933527Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning.
2025-12-04T13:21:31.4933567Z return func(*args, **kwargs) 2025-12-04T13:21:31.4933721Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4933883Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4934175Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4934341Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4934627Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4934753Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4935032Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4935182Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4935460Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4935608Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4935887Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4936024Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4936304Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4936454Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4936997Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 154112 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4937114Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4937308Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4937719Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4937849Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4938063Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4938288Z [rank0]:E1204 13:19:14.662000 561301 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4938328Z dist init r=0, world=4 2025-12-04T13:21:31.4938465Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4938641Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4938929Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4939083Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4939371Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4939495Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4939773Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4939921Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4940197Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4940344Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4940621Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4940759Z [rank3]:E1204 13:19:14.664000 561304 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4941048Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4941197Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4941724Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 152064 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T13:21:31.4941841Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4942035Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4942461Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4942589Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4942799Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4942973Z [rank3]:E1204 13:19:14.664000 561304 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4943012Z dist init r=3, world=4 2025-12-04T13:21:31.4943149Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4943309Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4943598Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4943752Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4944036Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4944161Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4944438Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in 
wrapper 2025-12-04T13:21:31.4944588Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4944865Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4945014Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4945298Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4945435Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4945712Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4945860Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4946397Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 150016 on device 1. CUDA driver allocated memory was 2317352960 and is now 3483369472. 
2025-12-04T13:21:31.4946512Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4946725Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4947139Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4947264Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4947476Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4947640Z [rank1]:E1204 13:19:14.713000 561302 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4947681Z dist init r=1, world=4 2025-12-04T13:21:31.4947818Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4947978Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4948340Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4948496Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4948781Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4948905Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4949183Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4949331Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4949620Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4949768Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4950048Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4950183Z [rank2]:E1204 13:19:14.722000 561303 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4950461Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4950610Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4951147Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 164352 on device 2. CUDA driver allocated memory was 2300575744 and is now 3466592256. 2025-12-04T13:21:31.4951273Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4951480Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4951892Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4952008Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4952218Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4952382Z [rank2]:E1204 13:19:14.722000 561303 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4952421Z dist init r=2, world=4 2025-12-04T13:21:31.4952761Z [rank0]:[W1204 13:19:14.523981233 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4952799Z FAILED [7.7129s] [100%] 2025-12-04T13:21:31.4952802Z 2025-12-04T13:21:31.4952860Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4953006Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda _ 2025-12-04T13:21:31.4953052Z Traceback (most recent call last): 2025-12-04T13:21:31.4953216Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4953260Z self._join_processes(fn) 2025-12-04T13:21:31.4953434Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4953487Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4953675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4953718Z raise RuntimeError(error) 2025-12-04T13:21:31.4953801Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4953846Z Traceback (most recent call last): 2025-12-04T13:21:31.4954008Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4954049Z getattr(self, test_name)() 2025-12-04T13:21:31.4954206Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4954241Z fn() 2025-12-04T13:21:31.4954393Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4954433Z method(*args, **kwargs) 2025-12-04T13:21:31.4954586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4954625Z method(*args, **kwargs) 2025-12-04T13:21:31.4954784Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4954831Z with policy(): 2025-12-04T13:21:31.4954983Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4955022Z raise RuntimeError(msg) 2025-12-04T13:21:31.4955425Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 154112 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
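Note: the traceback shows the test body running inside `with policy():`, whose `__exit__` raises once GPU memory after the test exceeds the amount recorded before it. A much-simplified, hypothetical sketch of that before/after comparison follows; the real checker also queries the driver API, and this is not torch.testing's implementation:

    # Hypothetical sketch of a before/after CUDA memory comparison in the spirit
    # of the leak check above (caching-allocator bytes only).
    import contextlib
    import torch

    @contextlib.contextmanager
    def assert_no_cuda_leak(device: int = 0):
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        before = torch.cuda.memory_allocated(device)   # caching-allocator bytes in use
        yield
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: allocated memory was {before} "
                f"and is now reported as {after}"
            )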
2025-12-04T13:21:31.4955437Z 2025-12-04T13:21:31.4955513Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4955797Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4955800Z 2025-12-04T13:21:31.4955890Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4955892Z 2025-12-04T13:21:31.4955950Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4955995Z Traceback (most recent call last): 2025-12-04T13:21:31.4956157Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4956200Z getattr(self, test_name)() 2025-12-04T13:21:31.4956358Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4956393Z fn() 2025-12-04T13:21:31.4956544Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4956584Z method(*args, **kwargs) 2025-12-04T13:21:31.4956735Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4956775Z method(*args, **kwargs) 2025-12-04T13:21:31.4956924Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4956961Z with policy(): 2025-12-04T13:21:31.4957111Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4957153Z raise RuntimeError(msg) 2025-12-04T13:21:31.4957563Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 152064 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T13:21:31.4957566Z 2025-12-04T13:21:31.4957641Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4957923Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4957927Z 2025-12-04T13:21:31.4958015Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4958017Z 2025-12-04T13:21:31.4958019Z 2025-12-04T13:21:31.4958094Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4958220Z Process 0 terminated with exit code 10, terminating remaining processes. 
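Note: the NCCL warning earlier in this chunk flags that destroy_process_group() was never called before the worker exited. A minimal, hypothetical shutdown sequence for a standalone distributed script (placeholder names, assuming the usual env:// rendezvous variables such as RANK and WORLD_SIZE are set; this is not the test harness's code):

    # Hypothetical sketch of explicit process-group teardown before exit.
    import torch.distributed as dist

    def main():
        dist.init_process_group(backend="nccl")
        try:
            # ... training / test body runs here ...
            dist.barrier()
        finally:
            # Tear down the default process group so resources are released cleanly.
            dist.destroy_process_group()

    if __name__ == "__main__":
        main()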
2025-12-04T13:21:31.4958456Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-7de742e840eb07e8.xml - 2025-12-04T13:21:31.4958517Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4958826Z FAILED [7.7129s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.4958885Z Traceback (most recent call last): 2025-12-04T13:21:31.4959047Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4959090Z getattr(self, test_name)() 2025-12-04T13:21:31.4959262Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4959297Z fn() 2025-12-04T13:21:31.4959449Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4959488Z method(*args, **kwargs) 2025-12-04T13:21:31.4959641Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4959680Z method(*args, **kwargs) 2025-12-04T13:21:31.4959832Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4959870Z with policy(): 2025-12-04T13:21:31.4960021Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4960062Z raise RuntimeError(msg) 2025-12-04T13:21:31.4960467Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 154112 on device 0. CUDA driver allocated memory was 2453667840 and is now 3619684352. 
2025-12-04T13:21:31.4960470Z 2025-12-04T13:21:31.4960543Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4960822Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4960825Z 2025-12-04T13:21:31.4960912Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4960914Z 2025-12-04T13:21:31.4960972Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4961016Z Traceback (most recent call last): 2025-12-04T13:21:31.4961181Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4961221Z getattr(self, test_name)() 2025-12-04T13:21:31.4961401Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4961434Z fn() 2025-12-04T13:21:31.4961586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4961626Z method(*args, **kwargs) 2025-12-04T13:21:31.4961777Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4961815Z method(*args, **kwargs) 2025-12-04T13:21:31.4961965Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4962003Z with policy(): 2025-12-04T13:21:31.4962157Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4962197Z raise RuntimeError(msg) 2025-12-04T13:21:31.4962615Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda! Caching allocator allocated memory was 512 and is now reported as 152064 on device 3. CUDA driver allocated memory was 2250244096 and is now 3416260608. 2025-12-04T13:21:31.4962627Z 2025-12-04T13:21:31.4962701Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4962978Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4962981Z 2025-12-04T13:21:31.4963078Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4963142Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.4963205Z ======================= 1 failed, 18 deselected in 7.85s ======================= 2025-12-04T13:21:31.4963243Z Got exit code 1 2025-12-04T13:21:31.4963473Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda 2025-12-04T13:21:31.4963601Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.4963793Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4b555dfb546db2bb.xml 2025-12-04T13:21:31.4963852Z ============================= test session starts ============================== 2025-12-04T13:21:31.4963964Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4964006Z cachedir: .pytest_cache 2025-12-04T13:21:31.4964164Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4964211Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4964251Z configfile: pytest.ini 2025-12-04T13:21:31.4964414Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4964489Z collecting ... collected 60 items / 15 deselected / 45 selected 2025-12-04T13:21:31.4964544Z stepcurrent: skipping 15 already run items. 2025-12-04T13:21:31.4964587Z Running 4 items in this shard 2025-12-04T13:21:31.4964589Z 2025-12-04T13:21:31.4964943Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda I1204 13:19:18.930000 561634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 561703 2025-12-04T13:21:31.4965099Z I1204 13:19:18.931000 561634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 561704 2025-12-04T13:21:31.4965266Z I1204 13:19:18.931000 561634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 561705 2025-12-04T13:21:31.4965417Z I1204 13:19:18.932000 561634 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 561706 2025-12-04T13:21:31.4965996Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4966036Z _warn_cpu_init() 2025-12-04T13:21:31.4966613Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4966660Z _warn_cpu_init() 2025-12-04T13:21:31.4967224Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4967272Z _warn_cpu_init() 2025-12-04T13:21:31.4967839Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4967875Z _warn_cpu_init() 2025-12-04T13:21:31.4968192Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4968234Z return func(*args, **kwargs) 2025-12-04T13:21:31.4968379Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4968541Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4968832Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4968988Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4969273Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4969399Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4969695Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4969846Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4970121Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4970269Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4970546Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4970685Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4970976Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4971135Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4971659Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T13:21:31.4971788Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4971987Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4972395Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4972510Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4972722Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4972887Z [rank3]:E1204 13:19:24.846000 561706 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.4972926Z dist init r=3, world=4 2025-12-04T13:21:31.4973065Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4973225Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4973512Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4973666Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4973963Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4974087Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4974363Z [rank0]:E1204 13:19:24.867000 561703 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4974512Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4974788Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4974936Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4975224Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4975370Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4975647Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4975796Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4976328Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 55808 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 
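Note: the repeated c10d_logger UserWarning suggests passing `device_id` to `init_process_group` so that collectives like `barrier()` are bound to an explicit device instead of the current context. A minimal, hypothetical initialization along those lines (the LOCAL_RANK handling is an assumption, not taken from the test):

    # Hypothetical sketch: binding the process group to an explicit device.
    import os
    import torch
    import torch.distributed as dist

    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)

    dist.init_process_group(
        backend="nccl",
        device_id=torch.device("cuda", local_rank),  # silences the barrier() device warning
    )
    dist.barrier()
    dist.destroy_process_group()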
2025-12-04T13:21:31.4976445Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4976640Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4977043Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4977161Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4977372Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4977536Z [rank0]:E1204 13:19:24.867000 561703 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.4977574Z dist init r=0, world=4 2025-12-04T13:21:31.4977713Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4977873Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4978221Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4978375Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4978660Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4978785Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4979060Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4979209Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4979497Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4979645Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4979934Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4980069Z [rank2]:E1204 13:19:24.897000 561705 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4980371Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4980519Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4981041Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 2025-12-04T13:21:31.4981155Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4981352Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4981757Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4981870Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4982081Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4982244Z [rank2]:E1204 13:19:24.897000 561705 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.4982283Z dist init r=2, world=4 2025-12-04T13:21:31.4982420Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4982591Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4982878Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4983033Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4983319Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4983443Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4983731Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T13:21:31.4983878Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4984164Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4984310Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.4984595Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4984732Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.4985008Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4985158Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.4985681Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 2025-12-04T13:21:31.4985797Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4985992Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4986394Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4986508Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.4986719Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4986895Z [rank1]:E1204 13:19:24.929000 561704 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.4986934Z dist init r=1, world=4 2025-12-04T13:21:31.4987270Z [rank0]:[W1204 13:19:25.808411610 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.4987310Z FAILED [7.5132s] [ 25%] 2025-12-04T13:21:31.4987312Z 2025-12-04T13:21:31.4987368Z =================================== FAILURES =================================== 2025-12-04T13:21:31.4987509Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda _ 2025-12-04T13:21:31.4987556Z Traceback (most recent call last): 2025-12-04T13:21:31.4987721Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.4987763Z self._join_processes(fn) 2025-12-04T13:21:31.4987948Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.4988012Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.4988228Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.4988271Z raise RuntimeError(error) 2025-12-04T13:21:31.4988352Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4988396Z Traceback (most recent call last): 2025-12-04T13:21:31.4988572Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4988613Z getattr(self, test_name)() 2025-12-04T13:21:31.4988774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4988807Z fn() 2025-12-04T13:21:31.4988964Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4989004Z method(*args, **kwargs) 2025-12-04T13:21:31.4989156Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4989196Z method(*args, **kwargs) 2025-12-04T13:21:31.4989346Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4989384Z with policy(): 2025-12-04T13:21:31.4989538Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4989578Z raise RuntimeError(msg) 2025-12-04T13:21:31.4989978Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 
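Note: the repeated _warn_cpu_init() UserWarning recommends passing `device_id` to FSDP when the wrapped module is still on CPU, and points out that `sync_module_states=True` needs the module on a GPU device. A small, hypothetical construction following that recommendation (placeholder model; assumes the process group is already initialized and a GPU is selected):

    # Hypothetical sketch: letting FSDP move a CPU-resident module to the local GPU
    # during sharding initialization, as the warning recommends.
    import torch
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    model = nn.Linear(1024, 1024)  # still on CPU at this point
    fsdp_model = FSDP(
        model,
        device_id=torch.cuda.current_device(),  # move to GPU for sharding init
        sync_module_states=True,                # requires the module on a GPU device
    )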
2025-12-04T13:21:31.4989981Z 2025-12-04T13:21:31.4990056Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4990333Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4990335Z 2025-12-04T13:21:31.4990424Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4990427Z 2025-12-04T13:21:31.4990429Z 2025-12-04T13:21:31.4990504Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.4990610Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.4990844Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4b555dfb546db2bb.xml - 2025-12-04T13:21:31.4990905Z =========================== short test summary info ============================ 2025-12-04T13:21:31.4991198Z FAILED [7.5132s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.4991245Z Traceback (most recent call last): 2025-12-04T13:21:31.4991409Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4991451Z getattr(self, test_name)() 2025-12-04T13:21:31.4991611Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4991645Z fn() 2025-12-04T13:21:31.4991817Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4991857Z method(*args, **kwargs) 2025-12-04T13:21:31.4992022Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4992061Z method(*args, **kwargs) 2025-12-04T13:21:31.4992211Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.4992247Z with policy(): 2025-12-04T13:21:31.4992398Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.4992454Z raise RuntimeError(msg) 2025-12-04T13:21:31.4992853Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T13:21:31.4992855Z 2025-12-04T13:21:31.4992928Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.4993205Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4993207Z 2025-12-04T13:21:31.4993294Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.4993358Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.4993420Z ======================= 1 failed, 15 deselected in 7.65s ======================= 2025-12-04T13:21:31.4993457Z Got exit code 1 2025-12-04T13:21:31.4993497Z Retrying single test... 2025-12-04T13:21:31.4993689Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-078ce9761d4e414e.xml 2025-12-04T13:21:31.4993747Z ============================= test session starts ============================== 2025-12-04T13:21:31.4993858Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.4993900Z cachedir: .pytest_cache 2025-12-04T13:21:31.4994058Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.4994103Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.4994144Z configfile: pytest.ini 2025-12-04T13:21:31.4994307Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.4994382Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.4994664Z stepcurrent: skipping 15 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.4994708Z Running 1 items in this shard 2025-12-04T13:21:31.4994711Z 2025-12-04T13:21:31.4995060Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda I1204 13:19:29.097000 562036 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 562105 2025-12-04T13:21:31.4995215Z I1204 13:19:29.098000 562036 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 562106 2025-12-04T13:21:31.4995367Z I1204 13:19:29.098000 562036 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 562107 2025-12-04T13:21:31.4995519Z I1204 13:19:29.099000 562036 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 562108 2025-12-04T13:21:31.4996106Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4996154Z _warn_cpu_init() 2025-12-04T13:21:31.4996723Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.4996770Z _warn_cpu_init() 2025-12-04T13:21:31.4997337Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4997374Z _warn_cpu_init() 2025-12-04T13:21:31.4997937Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.4997976Z _warn_cpu_init() 2025-12-04T13:21:31.4998336Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.4998380Z return func(*args, **kwargs) 2025-12-04T13:21:31.4998522Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.4998684Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.4998990Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.4999149Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.4999433Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.4999560Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.4999838Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.4999988Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5000277Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5000435Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5000711Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5000860Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5001139Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5001289Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5001815Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T13:21:31.5001932Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5002126Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5002532Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5002649Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5002860Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5003025Z [rank0]:E1204 13:19:35.092000 562105 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5003063Z dist init r=0, world=4 2025-12-04T13:21:31.5003212Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5003373Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5003660Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5003814Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5004099Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5004224Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5004513Z [rank1]:E1204 13:19:35.104000 562106 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5004671Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5004946Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5005092Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5005377Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5005514Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5005790Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5005940Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5006460Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 
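Note: the RuntimeError above is raised by a leak-check policy that the test harness enters around the test body (the "with policy():" and "__exit__" frames in the traceback). It snapshots per-device memory before the test and compares afterwards. A minimal sketch of that idea, assuming nothing about the actual torch.testing._internal implementation beyond what the traceback shows:

import torch

class CudaLeakGuard:
    # Simplified stand-in for the mem-leak-check policy in the traceback above;
    # NOT the real torch.testing._internal implementation.
    def __init__(self, device=None):
        self.device = torch.cuda.current_device() if device is None else device

    def __enter__(self):
        torch.cuda.synchronize(self.device)
        self.alloc_before = torch.cuda.memory_allocated(self.device)   # caching-allocator bytes
        free, total = torch.cuda.mem_get_info(self.device)
        self.driver_before = total - free                               # driver-level bytes in use
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            return False  # never mask the test's own failure
        torch.cuda.synchronize(self.device)
        alloc_after = torch.cuda.memory_allocated(self.device)
        free, total = torch.cuda.mem_get_info(self.device)
        driver_after = total - free
        # Only flag a leak when both the allocator and the driver report growth,
        # echoing the "CUDA driver API confirmed a leak" wording in the log.
        if alloc_after > self.alloc_before and driver_after > self.driver_before:
            raise RuntimeError(
                f"possible CUDA memory leak on device {self.device}: "
                f"allocator {self.alloc_before} -> {alloc_after} bytes, "
                f"driver {self.driver_before} -> {driver_after} bytes"
            )
        return False

A test body would then run under "with CudaLeakGuard():", which is the shape of the "with policy():" frame shown in the traceback.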
2025-12-04T13:21:31.5006576Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5006772Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5007174Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5007289Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5007510Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5007675Z [rank1]:E1204 13:19:35.104000 562106 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5007714Z dist init r=1, world=4 2025-12-04T13:21:31.5007856Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5008017Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5008531Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5008685Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5008984Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5009108Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5009404Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5009552Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5009842Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5009987Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5010265Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5010401Z [rank2]:E1204 13:19:35.115000 562107 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5010677Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5010826Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5011346Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 55808 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 2025-12-04T13:21:31.5011462Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5011656Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5012071Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5012185Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5012398Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5012563Z [rank2]:E1204 13:19:35.115000 562107 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5012600Z dist init r=2, world=4 2025-12-04T13:21:31.5012738Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5012897Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5013194Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5013347Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5013642Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5013764Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5014051Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T13:21:31.5014201Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5014477Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5014625Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5014899Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5015037Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5015315Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5015465Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5015985Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 53760 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T13:21:31.5016100Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5016309Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5016711Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5016824Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5017034Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5017199Z [rank3]:E1204 13:19:35.151000 562108 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5017239Z dist init r=3, world=4 2025-12-04T13:21:31.5019650Z [rank0]:[W1204 13:19:35.936786860 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.5019713Z FAILED [7.6145s] [100%] 2025-12-04T13:21:31.5019716Z 2025-12-04T13:21:31.5019777Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5019918Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda _ 2025-12-04T13:21:31.5019966Z Traceback (most recent call last): 2025-12-04T13:21:31.5020144Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5020190Z self._join_processes(fn) 2025-12-04T13:21:31.5020365Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5020418Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5020599Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5020642Z raise RuntimeError(error) 2025-12-04T13:21:31.5020725Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5020770Z Traceback (most recent call last): 2025-12-04T13:21:31.5020933Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5020977Z getattr(self, test_name)() 2025-12-04T13:21:31.5021136Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5021171Z fn() 2025-12-04T13:21:31.5021324Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5021364Z method(*args, **kwargs) 2025-12-04T13:21:31.5021519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5021559Z method(*args, **kwargs) 2025-12-04T13:21:31.5021710Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5021747Z with policy(): 2025-12-04T13:21:31.5021899Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5021941Z raise RuntimeError(msg) 2025-12-04T13:21:31.5022356Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 
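Note: the "Process 0 exited with error code 10" failure is the parent test process reacting to a child rank's exit status; the "_join_processes" / "_check_return_codes" frames in the traceback join all ranks and raise if any exit code is non-zero. A rough sketch of that pattern, using only the standard library rather than PyTorch's test harness:

import multiprocessing as mp

def _worker(rank: int) -> None:
    # Stand-in for one rank's test body; a detected leak makes the rank
    # exit with code 10, as in "exiting process N with exit code: 10" above.
    raise SystemExit(10)

def run_multiprocess_test(world_size: int = 4) -> None:
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=_worker, args=(rank,)) for rank in range(world_size)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    for rank, p in enumerate(procs):
        if p.exitcode != 0:
            # The parent converts the first bad child exit code into the test failure.
            raise RuntimeError(f"Process {rank} exited with error code {p.exitcode}")

if __name__ == "__main__":
    run_multiprocess_test()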
2025-12-04T13:21:31.5022358Z 2025-12-04T13:21:31.5022436Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5022715Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5022718Z 2025-12-04T13:21:31.5022808Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5022810Z 2025-12-04T13:21:31.5022812Z 2025-12-04T13:21:31.5022889Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5022979Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5023218Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-078ce9761d4e414e.xml - 2025-12-04T13:21:31.5023280Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5023582Z FAILED [7.6145s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5023639Z Traceback (most recent call last): 2025-12-04T13:21:31.5023803Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5023846Z getattr(self, test_name)() 2025-12-04T13:21:31.5024006Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5024050Z fn() 2025-12-04T13:21:31.5024203Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5024244Z method(*args, **kwargs) 2025-12-04T13:21:31.5024396Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5024435Z method(*args, **kwargs) 2025-12-04T13:21:31.5024586Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5024622Z with policy(): 2025-12-04T13:21:31.5024774Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5024813Z raise RuntimeError(msg) 2025-12-04T13:21:31.5025214Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T13:21:31.5025218Z 2025-12-04T13:21:31.5025292Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5025572Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5025575Z 2025-12-04T13:21:31.5025663Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5025727Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5025789Z ======================= 1 failed, 18 deselected in 7.75s ======================= 2025-12-04T13:21:31.5025828Z Got exit code 1 2025-12-04T13:21:31.5025868Z Retrying single test... 2025-12-04T13:21:31.5026060Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9f2bd3f7b2fc9639.xml 2025-12-04T13:21:31.5026129Z ============================= test session starts ============================== 2025-12-04T13:21:31.5026243Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5026284Z cachedir: .pytest_cache 2025-12-04T13:21:31.5026444Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5026490Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5026531Z configfile: pytest.ini 2025-12-04T13:21:31.5026696Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5026771Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5027042Z stepcurrent: skipping 15 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5027086Z Running 1 items in this shard 2025-12-04T13:21:31.5027088Z 2025-12-04T13:21:31.5027459Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda I1204 13:19:39.227000 562438 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 562507 2025-12-04T13:21:31.5027626Z I1204 13:19:39.228000 562438 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 562508 2025-12-04T13:21:31.5027777Z I1204 13:19:39.228000 562438 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 562509 2025-12-04T13:21:31.5027937Z I1204 13:19:39.229000 562438 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 562510 2025-12-04T13:21:31.5028559Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5028599Z _warn_cpu_init() 2025-12-04T13:21:31.5029165Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.5029204Z _warn_cpu_init() 2025-12-04T13:21:31.5029771Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5029807Z _warn_cpu_init() 2025-12-04T13:21:31.5030377Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5030414Z _warn_cpu_init() 2025-12-04T13:21:31.5030722Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.5030766Z return func(*args, **kwargs) 2025-12-04T13:21:31.5030908Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5031071Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5031359Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5031516Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5031813Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5031953Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5032235Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5032399Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5032677Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5032824Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5033100Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5033238Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5033516Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5033666Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5034190Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 49664 on device 1. CUDA driver allocated memory was 2317352960 and is now 3458203648. 2025-12-04T13:21:31.5034307Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5034503Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5034922Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5035037Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5035251Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5035416Z [rank1]:E1204 13:19:45.331000 562508 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5035455Z dist init r=1, world=4 2025-12-04T13:21:31.5035594Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5035753Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5036051Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5036216Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5036502Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5036636Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5036916Z [rank3]:E1204 13:19:45.336000 562510 
site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5037066Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5037341Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5037490Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5037764Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5037901Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5038215Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5038365Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5038887Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 
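Note: the repeated "_warn_cpu_init()" UserWarning earlier in this session recommends passing "device_id" so that FSDP runs its sharding initialization on GPU and "sync_module_states=True" has a GPU-resident module to communicate. A minimal sketch of that recommendation, assuming a process group is already initialized and using a trivial nn.Linear as a placeholder for the real model:

import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

model = nn.Linear(8, 8)  # placeholder module, still on CPU at this point
fsdp_model = FSDP(
    model,
    device_id=torch.cuda.current_device(),  # lets FSDP move the module to the local GPU for sharding init
    sync_module_states=True,                # requires the module on GPU, per the warning text
)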
2025-12-04T13:21:31.5039003Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5039212Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5039619Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5039735Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5039946Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5040111Z [rank3]:E1204 13:19:45.336000 562510 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5040150Z dist init r=3, world=4 2025-12-04T13:21:31.5040299Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5040459Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5040757Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5040913Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5041213Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5041338Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5041617Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5041768Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5042045Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5042192Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5042473Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5042609Z [rank2]:E1204 13:19:45.384000 562509 
site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5042886Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5043034Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5043572Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 57856 on device 2. CUDA driver allocated memory was 2300575744 and is now 3441426432. 2025-12-04T13:21:31.5043688Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5043886Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5044290Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5044403Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5044627Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5044791Z [rank2]:E1204 13:19:45.384000 562509 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5044840Z dist init r=2, world=4 2025-12-04T13:21:31.5044977Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5045136Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5045432Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5045586Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5045872Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5045996Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5046275Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 
2025-12-04T13:21:31.5046424Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5046701Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5046849Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5047125Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5047261Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5047540Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5047699Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5048265Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 61952 on device 0. CUDA driver allocated memory was 2453667840 and is now 3594518528. 2025-12-04T13:21:31.5048380Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5048577Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5048994Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5049122Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5049332Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5049496Z [rank0]:E1204 13:19:45.417000 562507 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5049545Z dist init r=0, world=4 2025-12-04T13:21:31.5049885Z [rank0]:[W1204 13:19:45.468384433 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.5049925Z FAILED [7.8138s] [100%] 2025-12-04T13:21:31.5049927Z 2025-12-04T13:21:31.5049984Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5050126Z _ TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda _ 2025-12-04T13:21:31.5050172Z Traceback (most recent call last): 2025-12-04T13:21:31.5050336Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5050378Z self._join_processes(fn) 2025-12-04T13:21:31.5050553Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5050606Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5050787Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5050830Z raise RuntimeError(error) 2025-12-04T13:21:31.5050912Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5050957Z Traceback (most recent call last): 2025-12-04T13:21:31.5051118Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5051159Z getattr(self, test_name)() 2025-12-04T13:21:31.5051317Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5051352Z fn() 2025-12-04T13:21:31.5051503Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5051544Z method(*args, **kwargs) 2025-12-04T13:21:31.5051713Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5051753Z method(*args, **kwargs) 2025-12-04T13:21:31.5051905Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5051944Z with policy(): 2025-12-04T13:21:31.5052095Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5052136Z raise RuntimeError(msg) 2025-12-04T13:21:31.5052534Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 
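Note: the reported numbers are consistent across ranks and across both retries of the nested-wrapped-model test. Driver-allocated memory grows by 3,391,094,784 - 2,250,244,096 = 1,140,850,688 bytes (exactly 1088 MiB) on device 3, and the same delta holds on the other devices (e.g. 3,594,518,528 - 2,453,667,840 = 1,140,850,688 on device 0), while the caching allocator only grows from 512 bytes to roughly 48-61 KiB. An identical, fixed-size growth on every device looks more like a one-time per-device allocation left resident (communicator or library buffers, for example) than an unbounded leak, though the log alone cannot confirm that.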
2025-12-04T13:21:31.5052537Z 2025-12-04T13:21:31.5052613Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5052904Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5052906Z 2025-12-04T13:21:31.5053006Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5053008Z 2025-12-04T13:21:31.5053010Z 2025-12-04T13:21:31.5053086Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5053173Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5053407Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-9f2bd3f7b2fc9639.xml - 2025-12-04T13:21:31.5053479Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5053768Z FAILED [7.8138s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5053815Z Traceback (most recent call last): 2025-12-04T13:21:31.5053978Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5054021Z getattr(self, test_name)() 2025-12-04T13:21:31.5054180Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5054214Z fn() 2025-12-04T13:21:31.5054366Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5054406Z method(*args, **kwargs) 2025-12-04T13:21:31.5054557Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5054597Z method(*args, **kwargs) 2025-12-04T13:21:31.5054748Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5054785Z with policy(): 2025-12-04T13:21:31.5054938Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5054978Z raise RuntimeError(msg) 2025-12-04T13:21:31.5055376Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda! Caching allocator allocated memory was 512 and is now reported as 51712 on device 3. CUDA driver allocated memory was 2250244096 and is now 3391094784. 2025-12-04T13:21:31.5055379Z 2025-12-04T13:21:31.5055453Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5055740Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5055743Z 2025-12-04T13:21:31.5055832Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5055896Z !!!!!!!!!!!!!!!!!!!!!!!!!! 
stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5055958Z ======================= 1 failed, 18 deselected in 7.95s ======================= 2025-12-04T13:21:31.5055995Z Got exit code 1 2025-12-04T13:21:31.5056219Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda 2025-12-04T13:21:31.5056347Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.5056538Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8ed907b9022c6610.xml 2025-12-04T13:21:31.5056596Z ============================= test session starts ============================== 2025-12-04T13:21:31.5056720Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5056773Z cachedir: .pytest_cache 2025-12-04T13:21:31.5056930Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5056977Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5057017Z configfile: pytest.ini 2025-12-04T13:21:31.5057179Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5057271Z collecting ... collected 60 items / 16 deselected / 44 selected 2025-12-04T13:21:31.5057325Z stepcurrent: skipping 16 already run items. 2025-12-04T13:21:31.5057368Z Running 3 items in this shard 2025-12-04T13:21:31.5057370Z 2025-12-04T13:21:31.5057682Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda I1204 13:19:49.712000 562840 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 562909 2025-12-04T13:21:31.5057836Z I1204 13:19:49.713000 562840 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 562910 2025-12-04T13:21:31.5057989Z I1204 13:19:49.714000 562840 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 562911 2025-12-04T13:21:31.5058138Z I1204 13:19:49.714000 562840 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 562912 2025-12-04T13:21:31.5058554Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5058604Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5058960Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5059009Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5059360Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5059407Z self.encoder = TransformerEncoder( 
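Note: two warnings in this run point at process-group lifecycle hygiene: the ProcessGroupNCCL message about "destroy_process_group()" never being called before exit, and the "barrier(): using the device under current context" message recommending "device_id" in "init_process_group". A minimal per-rank sketch addressing both, assuming MASTER_ADDR/MASTER_PORT are already set in the environment and that the installed PyTorch is recent enough to accept "init_process_group(device_id=...)":

import torch
import torch.distributed as dist

def run_rank(rank: int, world_size: int) -> None:
    torch.cuda.set_device(rank)
    dist.init_process_group(
        "nccl",
        rank=rank,
        world_size=world_size,
        device_id=torch.device("cuda", rank),  # silences the barrier() device warning
    )
    try:
        dist.barrier()
        # ... test body ...
    finally:
        # Avoids the "destroy_process_group() was not called before program exit" warning.
        dist.destroy_process_group()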
2025-12-04T13:21:31.5059777Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5059823Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5060403Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5060443Z _warn_cpu_init() 2025-12-04T13:21:31.5061026Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5061064Z _warn_cpu_init() 2025-12-04T13:21:31.5061640Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5061689Z _warn_cpu_init() 2025-12-04T13:21:31.5062258Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5062296Z _warn_cpu_init() 2025-12-04T13:21:31.5062585Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:21:31.5062628Z return func(*args, **kwargs) 2025-12-04T13:21:31.5062770Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5062934Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5063225Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5063380Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5063667Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5063792Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5064071Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5064235Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5064516Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5064665Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5064941Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5065080Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5065366Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5065515Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5066007Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 
2025-12-04T13:21:31.5066135Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5066336Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5066699Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5066815Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5067027Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5067193Z [rank2]:E1204 13:19:58.999000 562911 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5067232Z dist init r=2, world=4 2025-12-04T13:21:31.5067371Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5067531Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5067819Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5067972Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5068315Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5068459Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5068738Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5068887Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5069162Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5069310Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5069585Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5069737Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5070027Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5070175Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5070670Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:21:31.5070787Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5070984Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5071345Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5071461Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5071676Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5071840Z [rank0]:E1204 13:19:59.005000 562909 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5071880Z dist init r=0, world=4 2025-12-04T13:21:31.5072017Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5072177Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5072464Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5072620Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5072920Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5073045Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5073325Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5073472Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.5073750Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5073909Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5074184Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5074339Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5074615Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5074775Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5075254Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:21:31.5075370Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5075565Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5075929Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5076042Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5076255Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5076420Z [rank3]:E1204 13:19:59.047000 562912 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5076457Z dist init r=3, world=4 2025-12-04T13:21:31.5076594Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5076754Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5077058Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5077213Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.5077498Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5077623Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5077902Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5078051Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5078380Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5078540Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5078815Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5078965Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5079244Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5079393Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5079870Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 
2025-12-04T13:21:31.5079985Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5080182Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5080545Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5080659Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5080870Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5081034Z [rank1]:E1204 13:19:59.085000 562910 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5081072Z dist init r=1, world=4 2025-12-04T13:21:31.5081422Z [rank0]:[W1204 13:19:59.959437921 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.5081465Z FAILED [11.5186s] [ 33%] 2025-12-04T13:21:31.5081467Z 2025-12-04T13:21:31.5081524Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5081626Z ___ TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda ____ 2025-12-04T13:21:31.5081672Z Traceback (most recent call last): 2025-12-04T13:21:31.5081835Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5081880Z self._join_processes(fn) 2025-12-04T13:21:31.5082052Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5082106Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5082284Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5082345Z raise RuntimeError(error) 2025-12-04T13:21:31.5082425Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5082483Z Traceback (most recent call last): 2025-12-04T13:21:31.5082646Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5082688Z getattr(self, test_name)() 2025-12-04T13:21:31.5082846Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5082891Z fn() 2025-12-04T13:21:31.5083042Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5083083Z method(*args, **kwargs) 2025-12-04T13:21:31.5083234Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5083274Z method(*args, **kwargs) 2025-12-04T13:21:31.5083425Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5083463Z with policy(): 2025-12-04T13:21:31.5083614Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5083656Z raise RuntimeError(msg) 2025-12-04T13:21:31.5084009Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:21:31.5084013Z 2025-12-04T13:21:31.5084089Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5084323Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5084326Z 2025-12-04T13:21:31.5084415Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5084417Z 2025-12-04T13:21:31.5084478Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.5084523Z Traceback (most recent call last): 2025-12-04T13:21:31.5084686Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5084728Z getattr(self, test_name)() 2025-12-04T13:21:31.5084889Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5084922Z fn() 2025-12-04T13:21:31.5085083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5085122Z method(*args, **kwargs) 2025-12-04T13:21:31.5085275Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5085315Z method(*args, **kwargs) 2025-12-04T13:21:31.5085465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5085502Z with policy(): 2025-12-04T13:21:31.5085653Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5085694Z raise RuntimeError(msg) 2025-12-04T13:21:31.5086046Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 
2025-12-04T13:21:31.5086049Z 2025-12-04T13:21:31.5086122Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5086361Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5086373Z 2025-12-04T13:21:31.5086462Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5086464Z 2025-12-04T13:21:31.5086523Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5086568Z Traceback (most recent call last): 2025-12-04T13:21:31.5086739Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5086781Z getattr(self, test_name)() 2025-12-04T13:21:31.5086940Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5086974Z fn() 2025-12-04T13:21:31.5087126Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5087166Z method(*args, **kwargs) 2025-12-04T13:21:31.5087317Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5087356Z method(*args, **kwargs) 2025-12-04T13:21:31.5087507Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5087543Z with policy(): 2025-12-04T13:21:31.5087695Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5087735Z raise RuntimeError(msg) 2025-12-04T13:21:31.5088090Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:21:31.5088093Z 2025-12-04T13:21:31.5088206Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5088437Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5088439Z 2025-12-04T13:21:31.5088527Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5088530Z 2025-12-04T13:21:31.5088532Z 2025-12-04T13:21:31.5088608Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5088696Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.5088953Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8ed907b9022c6610.xml - 2025-12-04T13:21:31.5089016Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5089264Z FAILED [11.5186s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5089312Z Traceback (most recent call last): 2025-12-04T13:21:31.5089476Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5089518Z getattr(self, test_name)() 2025-12-04T13:21:31.5089678Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5089712Z fn() 2025-12-04T13:21:31.5089864Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5089904Z method(*args, **kwargs) 2025-12-04T13:21:31.5090075Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5090127Z method(*args, **kwargs) 2025-12-04T13:21:31.5090276Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5090314Z with policy(): 2025-12-04T13:21:31.5090465Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5090518Z raise RuntimeError(msg) 2025-12-04T13:21:31.5090868Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 
2025-12-04T13:21:31.5090872Z 2025-12-04T13:21:31.5090944Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5091175Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5091178Z 2025-12-04T13:21:31.5091264Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5091267Z 2025-12-04T13:21:31.5091325Z Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.5091369Z Traceback (most recent call last): 2025-12-04T13:21:31.5091534Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5091575Z getattr(self, test_name)() 2025-12-04T13:21:31.5091736Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5091770Z fn() 2025-12-04T13:21:31.5091922Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5091961Z method(*args, **kwargs) 2025-12-04T13:21:31.5092112Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5092150Z method(*args, **kwargs) 2025-12-04T13:21:31.5092300Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5092336Z with policy(): 2025-12-04T13:21:31.5092488Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5092528Z raise RuntimeError(msg) 2025-12-04T13:21:31.5092894Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 
2025-12-04T13:21:31.5092896Z 2025-12-04T13:21:31.5092971Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5093199Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5093202Z 2025-12-04T13:21:31.5093289Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5093292Z 2025-12-04T13:21:31.5093350Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5093395Z Traceback (most recent call last): 2025-12-04T13:21:31.5093558Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5093600Z getattr(self, test_name)() 2025-12-04T13:21:31.5093768Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5093803Z fn() 2025-12-04T13:21:31.5093963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5094003Z method(*args, **kwargs) 2025-12-04T13:21:31.5094153Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5094193Z method(*args, **kwargs) 2025-12-04T13:21:31.5094353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5094389Z with policy(): 2025-12-04T13:21:31.5094542Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5094583Z raise RuntimeError(msg) 2025-12-04T13:21:31.5094935Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:21:31.5094938Z 2025-12-04T13:21:31.5095009Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5095238Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5095242Z 2025-12-04T13:21:31.5095327Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5095394Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5095460Z ====================== 1 failed, 16 deselected in 11.66s ======================= 2025-12-04T13:21:31.5095497Z Got exit code 1 2025-12-04T13:21:31.5095536Z Retrying single test... 
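Each failing session above prints the same three UserWarnings: FSDP sharding initialization running on CPU, barrier() being issued without a device_id on the process group, and destroy_process_group() never being called before exit. As a hedged illustration only (this is not code from test_fsdp_core.py; MyModel and run are hypothetical placeholders, and the sketch assumes it is launched per-rank, e.g. via torchrun, with MASTER_ADDR/MASTER_PORT set), the pattern those warnings recommend looks roughly like this:

    import torch
    import torch.distributed as dist
    import torch.nn as nn
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    class MyModel(nn.Module):  # hypothetical stand-in for the real test model
        def __init__(self) -> None:
            super().__init__()
            self.linear = nn.Linear(8, 8)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.linear(x)

    def run(rank: int, world_size: int) -> None:
        device = torch.device("cuda", rank)
        # Passing device_id here is what the c10d barrier() warning suggests.
        dist.init_process_group("nccl", rank=rank, world_size=world_size, device_id=device)
        # Giving FSDP a device_id avoids the "sharding initialization run on CPU"
        # warning and moves the module to the GPU before sharding, as the
        # _init_utils warning text recommends.
        model = FSDP(MyModel(), device_id=device)
        x = torch.randn(4, 8, device=device)
        model(x).sum().backward()
        # Explicit teardown avoids the ProcessGroupNCCL "destroy_process_group()
        # was not called before program exit" warning seen at the end of each run.
        dist.destroy_process_group()

These warnings are benign for the test itself, but they explain the repeated device_id recommendations in the output above.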
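The RuntimeError text ("Caching allocator allocated memory was 512 and is now reported as ...") is produced by PyTorch's CUDA memory-leak checker, enabled here via PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1, which snapshots per-device memory before and after the test body. The following is a deliberately simplified sketch of that before/after comparison, not the actual implementation in common_utils.py; the real check also compares driver-level allocations and applies retry/threshold logic, both omitted here:

    import torch

    def check_leak(fn, device: int = 0) -> None:
        # Simplified stand-in for the mem_leak_check mode: compare the caching
        # allocator's live-tensor usage on one device before and after the test body.
        torch.cuda.synchronize(device)
        before = torch.cuda.memory_allocated(device)  # bytes held by live tensors
        fn()
        torch.cuda.synchronize(device)
        after = torch.cuda.memory_allocated(device)
        if after > before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator allocated "
                f"memory was {before} and is now reported as {after}"
            )

A test body that drops all of its tensor references passes such a check; one that stashes GPU tensors on a module or global survives the comparison and is reported as a leak, which is what exit code 10 signals for each rank above.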
2025-12-04T13:21:31.5095729Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-84cd3a84cc53b053.xml 2025-12-04T13:21:31.5095787Z ============================= test session starts ============================== 2025-12-04T13:21:31.5095899Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5095940Z cachedir: .pytest_cache 2025-12-04T13:21:31.5096098Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5096145Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5096185Z configfile: pytest.ini 2025-12-04T13:21:31.5096364Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5096439Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5096665Z stepcurrent: skipping 16 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5096710Z Running 1 items in this shard 2025-12-04T13:21:31.5096712Z 2025-12-04T13:21:31.5097019Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda I1204 13:20:03.663000 563242 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 563311 2025-12-04T13:21:31.5097173Z I1204 13:20:03.663000 563242 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 563312 2025-12-04T13:21:31.5097326Z I1204 13:20:03.664000 563242 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 563313 2025-12-04T13:21:31.5097476Z I1204 13:20:03.664000 563242 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 563314 2025-12-04T13:21:31.5097848Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5097908Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5098300Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5098363Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5098716Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5098763Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5099113Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5099159Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5099737Z 
/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5099776Z _warn_cpu_init() 2025-12-04T13:21:31.5100343Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5100380Z _warn_cpu_init() 2025-12-04T13:21:31.5100959Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5100999Z _warn_cpu_init() 2025-12-04T13:21:31.5101564Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5101603Z _warn_cpu_init() 2025-12-04T13:21:31.5101893Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 
2025-12-04T13:21:31.5101936Z return func(*args, **kwargs) 2025-12-04T13:21:31.5102080Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5102255Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5102557Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5102711Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5103008Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5103134Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5103413Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5103564Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5103843Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5103992Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5104268Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5104406Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5104684Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5104833Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5105328Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 
2025-12-04T13:21:31.5105446Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5105641Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5106004Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5106120Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5106333Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5106508Z [rank1]:E1204 13:20:12.978000 563312 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5106558Z dist init r=1, world=4 2025-12-04T13:21:31.5106697Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5106856Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5107143Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5107312Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5107597Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5107723Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5107998Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5108194Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5108471Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5108620Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5108894Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5109031Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5109310Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5109469Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5109947Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:21:31.5110063Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5110258Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5110623Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5110748Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5110960Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5111136Z [rank3]:E1204 13:20:12.985000 563314 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5111175Z dist init r=3, world=4 2025-12-04T13:21:31.5111311Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5111486Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5111773Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5111927Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5112212Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5112336Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5112614Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5112762Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.5113041Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5113188Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5113463Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5113600Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5113887Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5114035Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5114512Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:21:31.5114627Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5114822Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5115192Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5115318Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5115528Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5115704Z [rank0]:E1204 13:20:12.992000 563311 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5115742Z dist init r=0, world=4 2025-12-04T13:21:31.5115880Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5116040Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5116327Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5116480Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.5116765Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5116890Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5117165Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5117314Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5117590Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5117738Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5118020Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5118191Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5118469Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5118617Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5119095Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 
2025-12-04T13:21:31.5119226Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5119423Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5119793Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5119929Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5120142Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5120306Z [rank2]:E1204 13:20:13.024000 563313 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5120345Z dist init r=2, world=4 2025-12-04T13:21:31.5120681Z [rank0]:[W1204 13:20:13.965978040 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.5120722Z FAILED [11.2154s] [100%] 2025-12-04T13:21:31.5120724Z 2025-12-04T13:21:31.5120781Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5120886Z ___ TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda ____ 2025-12-04T13:21:31.5120932Z Traceback (most recent call last): 2025-12-04T13:21:31.5121097Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5121140Z self._join_processes(fn) 2025-12-04T13:21:31.5121314Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5121369Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5121548Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5121591Z raise RuntimeError(error) 2025-12-04T13:21:31.5121672Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5121717Z Traceback (most recent call last): 2025-12-04T13:21:31.5121879Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5121922Z getattr(self, test_name)() 2025-12-04T13:21:31.5122092Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5122128Z fn() 2025-12-04T13:21:31.5122283Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5122324Z method(*args, **kwargs) 2025-12-04T13:21:31.5122474Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5122516Z method(*args, **kwargs) 2025-12-04T13:21:31.5122665Z File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5122704Z with policy(): 2025-12-04T13:21:31.5122855Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5122897Z raise RuntimeError(msg) 2025-12-04T13:21:31.5123260Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 2025-12-04T13:21:31.5123273Z 2025-12-04T13:21:31.5123349Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5123580Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5123582Z 2025-12-04T13:21:31.5123671Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5123683Z 2025-12-04T13:21:31.5123685Z 2025-12-04T13:21:31.5123760Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5123848Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5124083Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-84cd3a84cc53b053.xml - 2025-12-04T13:21:31.5124144Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5124397Z FAILED [11.2154s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5124442Z Traceback (most recent call last): 2025-12-04T13:21:31.5124607Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5124650Z getattr(self, test_name)() 2025-12-04T13:21:31.5124810Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5124844Z fn() 2025-12-04T13:21:31.5124996Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5125037Z method(*args, **kwargs) 2025-12-04T13:21:31.5125188Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5125228Z method(*args, **kwargs) 2025-12-04T13:21:31.5125379Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5125415Z with policy(): 2025-12-04T13:21:31.5125567Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5125609Z raise RuntimeError(msg) 2025-12-04T13:21:31.5125971Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 
2025-12-04T13:21:31.5125974Z 2025-12-04T13:21:31.5126050Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5126280Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5126282Z 2025-12-04T13:21:31.5126370Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5126432Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5126496Z ====================== 1 failed, 18 deselected in 11.36s ======================= 2025-12-04T13:21:31.5126533Z Got exit code 1 2025-12-04T13:21:31.5126574Z Retrying single test... 2025-12-04T13:21:31.5126764Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8fb47012cbd58a54.xml 2025-12-04T13:21:31.5126821Z ============================= test session starts ============================== 2025-12-04T13:21:31.5126942Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5126993Z cachedir: .pytest_cache 2025-12-04T13:21:31.5127149Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5127196Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5127235Z configfile: pytest.ini 2025-12-04T13:21:31.5127398Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5127484Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5127708Z stepcurrent: skipping 16 already run items. 
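The "CUDA driver API confirmed a leak" failures above are raised by the memory-leak checker that PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 enables: it records caching-allocator and driver-level memory before the test and compares again afterwards. A minimal standalone sketch of that idea follows; it is not the actual CudaMemoryLeakCheck policy in common_utils.py, and check_for_cuda_leak is a hypothetical helper used only for illustration.

# Sketch only: approximates the before/after comparison described in the log,
# not the torch.testing._internal.common_utils implementation.
import torch

def check_for_cuda_leak(test_fn, device=0):
    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_before = torch.cuda.memory_allocated(device)      # caching-allocator bytes
    free_before, total = torch.cuda.mem_get_info(device)    # driver-level free/total bytes
    driver_before = total - free_before

    test_fn()

    torch.cuda.synchronize(device)
    torch.cuda.empty_cache()
    alloc_after = torch.cuda.memory_allocated(device)
    free_after, _ = torch.cuda.mem_get_info(device)
    driver_after = total - free_after

    if alloc_after > alloc_before and driver_after > driver_before:
        raise RuntimeError(
            f"possible CUDA leak: allocator {alloc_before} -> {alloc_after}, "
            f"driver {driver_before} -> {driver_after} bytes on device {device}"
        )

In the failing run above, allocator usage on device 1 grew from 512 to 295424 bytes and driver-allocated memory from 2317352960 to 3919577088 bytes, which is exactly the kind of growth this comparison flags.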
Running only test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5127753Z Running 1 items in this shard 2025-12-04T13:21:31.5127755Z 2025-12-04T13:21:31.5128062Z distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda I1204 13:20:17.507000 563644 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 563713 2025-12-04T13:21:31.5128250Z I1204 13:20:17.507000 563644 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 563714 2025-12-04T13:21:31.5128402Z I1204 13:20:17.508000 563644 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 563715 2025-12-04T13:21:31.5128554Z I1204 13:20:17.509000 563644 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 563716 2025-12-04T13:21:31.5128913Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5128963Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5129317Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5129364Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5129715Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5129761Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5130126Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5130171Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5130746Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5130785Z _warn_cpu_init() 2025-12-04T13:21:31.5131364Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 
2025-12-04T13:21:31.5131413Z _warn_cpu_init() 2025-12-04T13:21:31.5131981Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5132030Z _warn_cpu_init() 2025-12-04T13:21:31.5132598Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:1014: UserWarning: The passed-in `module` is on CPU and will thus have FSDP's sharding initialization run on CPU, which may be slower than on GPU. We recommend passing in the `device_id` argument for FSDP to move `module` to GPU for the sharding initialization. `module` must also be on GPU device to work with the `sync_module_states=True` flag since that requires GPU communication. 2025-12-04T13:21:31.5132634Z _warn_cpu_init() 2025-12-04T13:21:31.5132926Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/c10d_logger.py:83: UserWarning: barrier(): using the device under current context. You can specify `device_id` in `init_process_group` to mute this warning. 2025-12-04T13:21:31.5132968Z return func(*args, **kwargs) 2025-12-04T13:21:31.5133112Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5133275Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5133566Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5133720Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5134009Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5134136Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5134421Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5134572Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5134849Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5134996Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5135273Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5135413Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5135700Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5135865Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5136348Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 295424 on device 2. CUDA driver allocated memory was 2300575744 and is now 3902799872. 2025-12-04T13:21:31.5136475Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5136671Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5137033Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5137147Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5137360Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5137524Z [rank2]:E1204 13:20:26.878000 563715 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5137563Z dist init r=2, world=4 2025-12-04T13:21:31.5137702Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5137862Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5138173Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5138330Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5138632Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5138759Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5139035Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5139183Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5139458Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5139606Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5139893Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5140041Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5140318Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5140481Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5140963Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 
2025-12-04T13:21:31.5141082Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5141280Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5141638Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5141753Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5141966Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5142130Z [rank0]:E1204 13:20:26.883000 563713 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5142168Z dist init r=0, world=4 2025-12-04T13:21:31.5142306Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5142465Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5142753Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5142920Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5143208Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5143334Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5143610Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5143759Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5144034Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5144192Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5144475Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5144611Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5144896Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5145046Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5145526Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 1. CUDA driver allocated memory was 2317352960 and is now 3919577088. 2025-12-04T13:21:31.5145642Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5145839Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5146195Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5146309Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5146522Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5146686Z [rank1]:E1204 13:20:26.888000 563714 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5146825Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5146984Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5147279Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5147432Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5147717Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5147840Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5148120Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5148313Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5148602Z [rank3]:E1204 
13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5148762Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5149036Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5149184Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5149462Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5149612Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5150088Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 3. CUDA driver allocated memory was 2250244096 and is now 3852468224. 2025-12-04T13:21:31.5150204Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5150402Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5150759Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5150873Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5151085Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5151251Z [rank3]:E1204 13:20:26.888000 563716 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5151290Z dist init r=1, world=4 2025-12-04T13:21:31.5151339Z dist init r=3, world=4 2025-12-04T13:21:31.5151677Z [rank0]:[W1204 13:20:27.745944812 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. 
For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) 2025-12-04T13:21:31.5151717Z FAILED [11.3165s] [100%] 2025-12-04T13:21:31.5151720Z 2025-12-04T13:21:31.5151778Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5151878Z ___ TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda ____ 2025-12-04T13:21:31.5151923Z Traceback (most recent call last): 2025-12-04T13:21:31.5152087Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5152131Z self._join_processes(fn) 2025-12-04T13:21:31.5152304Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5152359Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5152555Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5152608Z raise RuntimeError(error) 2025-12-04T13:21:31.5152689Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5152734Z Traceback (most recent call last): 2025-12-04T13:21:31.5152895Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5152947Z getattr(self, test_name)() 2025-12-04T13:21:31.5153106Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5153141Z fn() 2025-12-04T13:21:31.5153294Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5153333Z method(*args, **kwargs) 2025-12-04T13:21:31.5153486Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5153525Z method(*args, **kwargs) 2025-12-04T13:21:31.5153676Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5153712Z with policy(): 2025-12-04T13:21:31.5153864Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5153904Z raise RuntimeError(msg) 2025-12-04T13:21:31.5154259Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 
2025-12-04T13:21:31.5154261Z 2025-12-04T13:21:31.5154336Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5154568Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5154571Z 2025-12-04T13:21:31.5154660Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5154663Z 2025-12-04T13:21:31.5154665Z 2025-12-04T13:21:31.5154741Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5154830Z Process 0 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5155064Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-8fb47012cbd58a54.xml - 2025-12-04T13:21:31.5155137Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5155385Z FAILED [11.3165s] distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5155432Z Traceback (most recent call last): 2025-12-04T13:21:31.5155596Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5155638Z getattr(self, test_name)() 2025-12-04T13:21:31.5155797Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5155833Z fn() 2025-12-04T13:21:31.5155984Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5156025Z method(*args, **kwargs) 2025-12-04T13:21:31.5156177Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5156217Z method(*args, **kwargs) 2025-12-04T13:21:31.5156376Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5156423Z with policy(): 2025-12-04T13:21:31.5156575Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5156616Z raise RuntimeError(msg) 2025-12-04T13:21:31.5156968Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda! Caching allocator allocated memory was 512 and is now reported as 227840 on device 0. CUDA driver allocated memory was 2453667840 and is now 4055891968. 2025-12-04T13:21:31.5156980Z 2025-12-04T13:21:31.5157055Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5157290Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParityWithDDPCUDA.test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5157292Z 2025-12-04T13:21:31.5157379Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5157444Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
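Two of the warnings in this run concern process-group lifecycle: the barrier() warning notes the device can be made explicit by passing `device_id` to `init_process_group`, and ProcessGroupNCCL warns that `destroy_process_group()` was never called before exit (see https://pytorch.org/docs/stable/distributed.html#shutdown). A minimal sketch of both recommendations, assuming a recent torch.distributed where `init_process_group` accepts `device_id` and the usual RANK/LOCAL_RANK launcher environment variables:

# Sketch only: explicit device binding at init and explicit teardown at exit.
import os
import torch
import torch.distributed as dist

def run_distributed(worker_fn):
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    torch.cuda.set_device(local_rank)                    # bind this process to one GPU
    dist.init_process_group(
        backend="nccl",
        device_id=torch.device("cuda", local_rank),      # mutes the barrier() device warning
    )
    try:
        worker_fn()
        dist.barrier()                                   # runs on the declared device
    finally:
        dist.destroy_process_group()                     # avoids the ProcessGroupNCCL leak warning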
2025-12-04T13:21:31.5157506Z ====================== 1 failed, 18 deselected in 11.46s ======================= 2025-12-04T13:21:31.5157543Z Got exit code 1 2025-12-04T13:21:31.5157722Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda 2025-12-04T13:21:31.5157851Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.5158042Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5aedcf6ff3ae4698.xml 2025-12-04T13:21:31.5158100Z ============================= test session starts ============================== 2025-12-04T13:21:31.5158258Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5158301Z cachedir: .pytest_cache 2025-12-04T13:21:31.5158458Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5158503Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5158544Z configfile: pytest.ini 2025-12-04T13:21:31.5158705Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5158781Z collecting ... collected 60 items / 17 deselected / 43 selected 2025-12-04T13:21:31.5158835Z stepcurrent: skipping 17 already run items. 2025-12-04T13:21:31.5158879Z Running 2 items in this shard 2025-12-04T13:21:31.5158881Z 2025-12-04T13:21:31.5159197Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda I1204 13:20:31.409000 564046 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 564115 2025-12-04T13:21:31.5159353Z I1204 13:20:31.409000 564046 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 564116 2025-12-04T13:21:31.5159507Z I1204 13:20:31.410000 564046 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 564117 2025-12-04T13:21:31.5159657Z I1204 13:20:31.410000 564046 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 564118 2025-12-04T13:21:31.5160017Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5160066Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5160370Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5160448Z {} 2025-12-04T13:21:31.5160555Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5160630Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5161125Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5161200Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5161556Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5161605Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5161894Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5161959Z {} 2025-12-04T13:21:31.5162063Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5162138Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5162628Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5162691Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5163043Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5163092Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5163389Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5163454Z {} 2025-12-04T13:21:31.5163556Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5163629Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5164121Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
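The FSDP warnings above (`_warn_cpu_init` and the `device_id` messages) both recommend giving FSDP an explicit device: either call `torch.cuda.set_device()` before wrapping or pass a concrete device index as `device_id`, so sharding initialization and `sync_module_states=True` run on the GPU rather than the CPU. A minimal sketch of that advice, where `rank` is a hypothetical per-process rank:

# Sketch only: explicit device index for FSDP wrapping, as the warnings suggest.
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def wrap_with_fsdp(module, rank):
    torch.cuda.set_device(rank)                          # make the current device explicit
    return FSDP(
        module,
        device_id=torch.cuda.current_device(),           # an index, not the bare "cuda" device
        sync_module_states=True,                         # needs the module on GPU for comms
    )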
2025-12-04T13:21:31.5164181Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5164548Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5164594Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5164891Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5164953Z {} 2025-12-04T13:21:31.5165055Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5165136Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5165624Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5165683Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5165827Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5165990Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5166279Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5166437Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5166726Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5166852Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5167131Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5167280Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5167575Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5167723Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5167999Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5168136Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5168460Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5168611Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5169100Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:21:31.5169232Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5169427Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5169789Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5169903Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5170117Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5170282Z [rank3]:E1204 13:20:37.307000 564118 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5170320Z dist init r=3, world=4 2025-12-04T13:21:31.5170459Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5170619Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5170907Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5171062Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5171349Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5171474Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5171750Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5171912Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5172187Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5172336Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5172610Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5172748Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5173026Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5173184Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5173662Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 
2025-12-04T13:21:31.5173787Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5173985Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5174330Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5174446Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5174658Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5174823Z [rank1]:E1204 13:20:37.365000 564116 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5174862Z dist init r=1, world=4 2025-12-04T13:21:31.5175001Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5175161Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5175447Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5175602Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5175887Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5176014Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5176301Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5176450Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5176725Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5176872Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5177147Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5177292Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5177570Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5177728Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5178245Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 2025-12-04T13:21:31.5178373Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5178571Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5178918Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5179031Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5179244Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5179409Z [rank0]:E1204 13:20:37.366000 564115 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5179449Z dist init r=0, world=4 2025-12-04T13:21:31.5179586Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5179746Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5180033Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5180187Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5180484Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5180610Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5180887Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5181033Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.5181311Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5181460Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5181746Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5181895Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5182171Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5182337Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5182804Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3024093184. 2025-12-04T13:21:31.5182921Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5183116Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5183459Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5183575Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5183786Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5183951Z [rank2]:E1204 13:20:37.367000 564117 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5183989Z dist init r=2, world=4 2025-12-04T13:21:31.5184027Z FAILED [6.9142s] [ 50%] 2025-12-04T13:21:31.5184029Z 2025-12-04T13:21:31.5184086Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5184183Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda _______ 2025-12-04T13:21:31.5184229Z Traceback (most recent call last): 2025-12-04T13:21:31.5184392Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5184458Z self._join_processes(fn) 2025-12-04T13:21:31.5184631Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in 
_join_processes 2025-12-04T13:21:31.5184686Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5184865Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5184909Z raise RuntimeError(error) 2025-12-04T13:21:31.5184989Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5185034Z Traceback (most recent call last): 2025-12-04T13:21:31.5185197Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5185240Z getattr(self, test_name)() 2025-12-04T13:21:31.5185399Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5185434Z fn() 2025-12-04T13:21:31.5185595Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5185636Z method(*args, **kwargs) 2025-12-04T13:21:31.5185798Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5185838Z method(*args, **kwargs) 2025-12-04T13:21:31.5185988Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5186026Z with policy(): 2025-12-04T13:21:31.5186187Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5186230Z raise RuntimeError(msg) 2025-12-04T13:21:31.5186574Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:21:31.5186578Z 2025-12-04T13:21:31.5186653Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5186873Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5186875Z 2025-12-04T13:21:31.5186964Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5186966Z 2025-12-04T13:21:31.5186968Z 2025-12-04T13:21:31.5187045Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5187132Z Process 3 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.5187368Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-5aedcf6ff3ae4698.xml - 2025-12-04T13:21:31.5187429Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5187668Z FAILED [6.9142s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5187715Z Traceback (most recent call last): 2025-12-04T13:21:31.5187880Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5187922Z getattr(self, test_name)() 2025-12-04T13:21:31.5188083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5188118Z fn() 2025-12-04T13:21:31.5188320Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5188361Z method(*args, **kwargs) 2025-12-04T13:21:31.5188513Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5188553Z method(*args, **kwargs) 2025-12-04T13:21:31.5188703Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5188740Z with policy(): 2025-12-04T13:21:31.5188891Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5188932Z raise RuntimeError(msg) 2025-12-04T13:21:31.5189275Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:21:31.5189277Z 2025-12-04T13:21:31.5189352Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5189583Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5189597Z 2025-12-04T13:21:31.5189685Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5189748Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5189810Z ======================= 1 failed, 17 deselected in 7.06s ======================= 2025-12-04T13:21:31.5189860Z Got exit code 1 2025-12-04T13:21:31.5189900Z Retrying single test... 
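The failure above comes from the test harness's CUDA memory-leak check: it snapshots caching-allocator and driver-level memory before the test body and compares them afterwards. On device 0 the caching allocator grew from 512 to 22528 bytes and the driver-reported allocation grew from 2453667840 to 3177185280 bytes (exactly 690 MiB more), so the check raised and the process exited with code 10. The sketch below only illustrates that before/after comparison; it is a hypothetical stand-in built on public torch.cuda calls, not PyTorch's actual leak checker, and run_with_leak_check / test_fn are illustrative names.

    # Minimal sketch (assumption: NOT PyTorch's real leak checker) of the
    # before/after comparison described in the failure message above.
    import torch

    def run_with_leak_check(test_fn, device=0):
        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_before = torch.cuda.memory_allocated(device)    # caching-allocator view
        free_before, total = torch.cuda.mem_get_info(device)  # driver view (free, total)
        driver_before = total - free_before

        test_fn()

        torch.cuda.synchronize(device)
        torch.cuda.empty_cache()
        alloc_after = torch.cuda.memory_allocated(device)
        free_after, _ = torch.cuda.mem_get_info(device)
        driver_after = total - free_after

        # Flag a leak only if both views grew, mirroring the wording of the log.
        if alloc_after > alloc_before and driver_after > driver_before:
            raise RuntimeError(
                f"possible leak on device {device}: caching allocator "
                f"{alloc_before} -> {alloc_after}, driver {driver_before} -> {driver_after}"
            )

The repro line the harness prints (PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda) re-runs only this test with the same leak check enabled.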
2025-12-04T13:21:31.5190089Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4348f9a325949f23.xml 2025-12-04T13:21:31.5190147Z ============================= test session starts ============================== 2025-12-04T13:21:31.5190261Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5190302Z cachedir: .pytest_cache 2025-12-04T13:21:31.5190461Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5190507Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5190547Z configfile: pytest.ini 2025-12-04T13:21:31.5190709Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5190785Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5191000Z stepcurrent: skipping 17 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5191044Z Running 1 items in this shard 2025-12-04T13:21:31.5191047Z 2025-12-04T13:21:31.5191341Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda I1204 13:20:40.948000 564440 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 564509 2025-12-04T13:21:31.5191497Z I1204 13:20:40.949000 564440 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 564510 2025-12-04T13:21:31.5191647Z I1204 13:20:40.950000 564440 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 564511 2025-12-04T13:21:31.5191798Z I1204 13:20:40.950000 564440 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 564512 2025-12-04T13:21:31.5192175Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5192224Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5192516Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5192581Z {} 2025-12-04T13:21:31.5192687Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5192761Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5193253Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5193315Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5193680Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5193740Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5194090Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5194148Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5194438Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5194503Z {} 2025-12-04T13:21:31.5194606Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5194680Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5194966Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5195028Z {} 2025-12-04T13:21:31.5195132Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5195205Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5195697Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5195757Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5196241Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5196299Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5196662Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5196709Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5197000Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5197062Z {} 2025-12-04T13:21:31.5197163Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5197236Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5197732Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5197792Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5197951Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5198115Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5198441Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5198614Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5198903Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5199028Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5199308Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5199458Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5199735Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5199884Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5200161Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5200299Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5200576Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5200738Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5201206Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 2025-12-04T13:21:31.5201324Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5201521Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5201871Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5202001Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5202212Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5202399Z [rank0]:E1204 13:20:47.106000 564509 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5202437Z dist init r=0, world=4 2025-12-04T13:21:31.5202576Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5202748Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5203035Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5203190Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5203475Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5203600Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5203877Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5204026Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5204302Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5204450Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5204723Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5204861Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5205150Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5205299Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5205765Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 
2025-12-04T13:21:31.5205880Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5206078Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5206437Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5206564Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5206777Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5206950Z [rank3]:E1204 13:20:47.111000 564512 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5206989Z dist init r=3, world=4 2025-12-04T13:21:31.5207128Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5207289Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5207574Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5207730Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5208013Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5208140Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5208454Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5208604Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5208879Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5209026Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5209312Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5209449Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5209726Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5209877Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5210340Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 2025-12-04T13:21:31.5210456Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5210664Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5211026Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5211139Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5211362Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5211527Z [rank1]:E1204 13:20:47.113000 564510 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5211566Z dist init r=1, world=4 2025-12-04T13:21:31.5211704Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5211864Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5212150Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5212304Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5212589Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5212713Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5212992Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5213141Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.5213418Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5213576Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5213852Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5213989Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5214265Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5214415Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5214900Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3024093184. 2025-12-04T13:21:31.5215023Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5215218Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5215568Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5215698Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5215909Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5216074Z [rank2]:E1204 13:20:47.116000 564511 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5216113Z dist init r=2, world=4 2025-12-04T13:21:31.5216151Z FAILED [7.4173s] [100%] 2025-12-04T13:21:31.5216156Z 2025-12-04T13:21:31.5216212Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5216309Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda _______ 2025-12-04T13:21:31.5216356Z Traceback (most recent call last): 2025-12-04T13:21:31.5216519Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5216563Z self._join_processes(fn) 2025-12-04T13:21:31.5216736Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in 
_join_processes 2025-12-04T13:21:31.5216791Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5216969Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5217012Z raise RuntimeError(error) 2025-12-04T13:21:31.5217092Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5217137Z Traceback (most recent call last): 2025-12-04T13:21:31.5217298Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5217341Z getattr(self, test_name)() 2025-12-04T13:21:31.5217509Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5217544Z fn() 2025-12-04T13:21:31.5217698Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5217740Z method(*args, **kwargs) 2025-12-04T13:21:31.5217892Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5217933Z method(*args, **kwargs) 2025-12-04T13:21:31.5218083Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5218120Z with policy(): 2025-12-04T13:21:31.5218307Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5218347Z raise RuntimeError(msg) 2025-12-04T13:21:31.5218702Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 2025-12-04T13:21:31.5218704Z 2025-12-04T13:21:31.5218791Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5219010Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5219013Z 2025-12-04T13:21:31.5219101Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5219115Z 2025-12-04T13:21:31.5219117Z 2025-12-04T13:21:31.5219195Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5219283Z Process 1 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.5219516Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4348f9a325949f23.xml - 2025-12-04T13:21:31.5219577Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5219812Z FAILED [7.4173s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5219859Z Traceback (most recent call last): 2025-12-04T13:21:31.5220022Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5220065Z getattr(self, test_name)() 2025-12-04T13:21:31.5220225Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5220260Z fn() 2025-12-04T13:21:31.5220412Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5220452Z method(*args, **kwargs) 2025-12-04T13:21:31.5220603Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5220643Z method(*args, **kwargs) 2025-12-04T13:21:31.5220792Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5220829Z with policy(): 2025-12-04T13:21:31.5220980Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5221022Z raise RuntimeError(msg) 2025-12-04T13:21:31.5221375Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 2025-12-04T13:21:31.5221378Z 2025-12-04T13:21:31.5221452Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5221673Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5221676Z 2025-12-04T13:21:31.5221763Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5221826Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5221886Z ======================= 1 failed, 18 deselected in 7.56s ======================= 2025-12-04T13:21:31.5221924Z Got exit code 1 2025-12-04T13:21:31.5221964Z Retrying single test... 
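Each retry also repeats the FSDP UserWarning about `device_id` cuda having no explicit index, and the warning itself names the remedy: either call torch.cuda.set_device() before FSDP initialization or pass an explicit device index as `device_id`. A minimal sketch of both options follows; names such as rank and model are illustrative and not taken from this test, and this assumes an ordinary one-GPU-per-rank torch.distributed setup.

    # Sketch of the fix the FSDP warning suggests (illustrative only).
    import torch
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def wrap(model, rank):
        torch.cuda.set_device(rank)                               # option 1: set the current device first
        return FSDP(model, device_id=torch.device("cuda", rank))  # option 2: explicit device index

Silencing that warning is independent of the leak check itself; in this log the leak check fails on all four ranks in every retry regardless.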
2025-12-04T13:21:31.5222152Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-140229431b9f8263.xml 2025-12-04T13:21:31.5222209Z ============================= test session starts ============================== 2025-12-04T13:21:31.5222331Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5222372Z cachedir: .pytest_cache 2025-12-04T13:21:31.5222541Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5222587Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5222627Z configfile: pytest.ini 2025-12-04T13:21:31.5222788Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5222875Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5223087Z stepcurrent: skipping 17 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5223132Z Running 1 items in this shard 2025-12-04T13:21:31.5223134Z 2025-12-04T13:21:31.5223430Z distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda I1204 13:20:50.957000 564834 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 564903 2025-12-04T13:21:31.5223585Z I1204 13:20:50.958000 564834 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 564904 2025-12-04T13:21:31.5223737Z I1204 13:20:50.958000 564834 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 564905 2025-12-04T13:21:31.5223888Z I1204 13:20:50.959000 564834 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 564906 2025-12-04T13:21:31.5224249Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5224297Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5224589Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5224655Z {} 2025-12-04T13:21:31.5224761Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5224835Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5225342Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5225405Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5225759Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5225807Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5226093Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5226157Z {} 2025-12-04T13:21:31.5226260Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5226335Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5226832Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5226904Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5227261Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5227317Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5227604Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5227667Z {} 2025-12-04T13:21:31.5227770Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5227842Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5228376Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5228437Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5228791Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5228839Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5229125Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_wrap_utils.py:64: UserWarning: Both mixed precision and an auto_wrap_policy were specified to FSDP, where the wrapped module has submodules of type: 2025-12-04T13:21:31.5229187Z {} 2025-12-04T13:21:31.5229288Z These modules will be wrapped as separate FSDP instacnes with mixed precision disabled. 2025-12-04T13:21:31.5229362Z _warn_on_overridden_mixed_precision(overridden_module_classes) 2025-12-04T13:21:31.5229878Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5229939Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5230083Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5230247Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5230536Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5230692Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5230989Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5231127Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5231405Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5231566Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5231848Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5231997Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5232273Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5232411Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5232688Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5232838Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5233305Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 2025-12-04T13:21:31.5233422Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5233619Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5233976Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5234092Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5234306Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5234475Z [rank0]:E1204 13:20:56.840000 564903 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5234513Z dist init r=0, world=4 2025-12-04T13:21:31.5234652Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5234812Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5235114Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5235268Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5235564Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5235690Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5235975Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File 
"/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5236124Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5236403Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5236554Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5236831Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5236968Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5237249Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5237398Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5237865Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 1. CUDA driver allocated memory was 2317352960 and is now 3040870400. 
2025-12-04T13:21:31.5237982Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5238234Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5238581Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5238695Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5238908Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5239073Z [rank1]:E1204 13:20:56.850000 564904 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5239113Z dist init r=1, world=4 2025-12-04T13:21:31.5239251Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5239424Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5239710Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5239876Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5240161Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5240299Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5240576Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5240724Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5240999Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5241148Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5241426Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5241563Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5241839Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5241988Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5242463Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 2. CUDA driver allocated memory was 2300575744 and is now 3024093184. 2025-12-04T13:21:31.5242579Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5242774Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5243120Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5243275Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5243507Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5243698Z [rank2]:E1204 13:20:56.891000 564905 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5246097Z dist init r=2, world=4 2025-12-04T13:21:31.5246253Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5246425Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5246714Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5246878Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5247164Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5247287Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5247565Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5247711Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 
2025-12-04T13:21:31.5247992Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5248248Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5248527Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5248663Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5248941Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5249090Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5249580Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 3. CUDA driver allocated memory was 2250244096 and is now 2973761536. 2025-12-04T13:21:31.5249697Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5249892Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5250238Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5250352Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5250575Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5250754Z [rank3]:E1204 13:20:56.893000 564906 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5250792Z dist init r=3, world=4 2025-12-04T13:21:31.5250831Z FAILED [6.9131s] [100%] 2025-12-04T13:21:31.5250834Z 2025-12-04T13:21:31.5250892Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5251005Z ______ TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda _______ 2025-12-04T13:21:31.5251052Z Traceback (most recent call last): 2025-12-04T13:21:31.5251216Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5251260Z self._join_processes(fn) 2025-12-04T13:21:31.5251433Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in 
_join_processes 2025-12-04T13:21:31.5251489Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5251665Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5251709Z raise RuntimeError(error) 2025-12-04T13:21:31.5251790Z RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5251835Z Traceback (most recent call last): 2025-12-04T13:21:31.5251998Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5252040Z getattr(self, test_name)() 2025-12-04T13:21:31.5252198Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5252233Z fn() 2025-12-04T13:21:31.5252385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5252427Z method(*args, **kwargs) 2025-12-04T13:21:31.5252577Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5252616Z method(*args, **kwargs) 2025-12-04T13:21:31.5252765Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5252803Z with policy(): 2025-12-04T13:21:31.5252956Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5252997Z raise RuntimeError(msg) 2025-12-04T13:21:31.5253349Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 2025-12-04T13:21:31.5253352Z 2025-12-04T13:21:31.5253428Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5253649Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5253651Z 2025-12-04T13:21:31.5253739Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5253743Z 2025-12-04T13:21:31.5253744Z 2025-12-04T13:21:31.5253822Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5253909Z Process 0 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.5254143Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-140229431b9f8263.xml - 2025-12-04T13:21:31.5254213Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5254463Z FAILED [6.9131s] distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda - RuntimeError: Process 0 exited with error code 10 and exception: 2025-12-04T13:21:31.5254508Z Traceback (most recent call last): 2025-12-04T13:21:31.5254673Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5254726Z getattr(self, test_name)() 2025-12-04T13:21:31.5254884Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5254919Z fn() 2025-12-04T13:21:31.5255072Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5255112Z method(*args, **kwargs) 2025-12-04T13:21:31.5255266Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5255307Z method(*args, **kwargs) 2025-12-04T13:21:31.5255458Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5255497Z with policy(): 2025-12-04T13:21:31.5255648Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5255690Z raise RuntimeError(msg) 2025-12-04T13:21:31.5256032Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda! Caching allocator allocated memory was 512 and is now reported as 22528 on device 0. CUDA driver allocated memory was 2453667840 and is now 3177185280. 2025-12-04T13:21:31.5256034Z 2025-12-04T13:21:31.5256110Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5256328Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestNoGradCUDA.test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5256332Z 2025-12-04T13:21:31.5256419Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5256482Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 
2025-12-04T13:21:31.5256543Z ======================= 1 failed, 18 deselected in 7.05s ======================= 2025-12-04T13:21:31.5256582Z Got exit code 1 2025-12-04T13:21:31.5256749Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda 2025-12-04T13:21:31.5256891Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.5257080Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a6879dc84d5f9c6e.xml 2025-12-04T13:21:31.5257139Z ============================= test session starts ============================== 2025-12-04T13:21:31.5257252Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5257295Z cachedir: .pytest_cache 2025-12-04T13:21:31.5257454Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5257500Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5257540Z configfile: pytest.ini 2025-12-04T13:21:31.5257705Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5257780Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5257833Z stepcurrent: skipping 18 already run items. 2025-12-04T13:21:31.5257876Z Running 1 items in this shard 2025-12-04T13:21:31.5257878Z 2025-12-04T13:21:31.5258257Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda I1204 13:21:00.407000 565228 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 565297 2025-12-04T13:21:31.5258424Z I1204 13:21:00.408000 565228 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 565298 2025-12-04T13:21:31.5258578Z I1204 13:21:00.408000 565228 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 565299 2025-12-04T13:21:31.5258750Z I1204 13:21:00.409000 565228 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 565300 2025-12-04T13:21:31.5259113Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5259164Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5259659Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5259724Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5260080Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5260129Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5260617Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5260679Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5261032Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5261078Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5261581Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5261641Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5261993Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5262040Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5262542Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5262610Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5262753Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5262917Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5263207Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5263373Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5263661Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5263786Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5264066Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5264214Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5264493Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5264641Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5264918Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5265055Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5265331Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5265490Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5265972Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T13:21:31.5266090Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5266284Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5266646Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5266771Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5266983Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5267159Z [rank1]:E1204 13:21:06.831000 565298 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5267197Z dist init r=1, world=4 2025-12-04T13:21:31.5267335Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5267503Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5267792Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5267946Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5268280Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5268405Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5268682Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5268830Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5269107Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5269257Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5269534Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5269672Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5269964Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5270112Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5270593Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T13:21:31.5270708Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5270905Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5271274Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5271401Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5271614Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5271792Z [rank2]:E1204 13:21:06.840000 565299 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5271832Z dist init r=2, world=4 2025-12-04T13:21:31.5271969Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5272131Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5272416Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5272571Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5272856Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5272987Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5273264Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5273411Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.5273686Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5273834Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5274122Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5274258Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5274536Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5274684Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5275162Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T13:21:31.5275286Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5275480Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5275853Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5275976Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5276189Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5276356Z [rank3]:E1204 13:21:06.887000 565300 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5276394Z dist init r=3, world=4 2025-12-04T13:21:31.5276533Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5276691Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5276978Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5277132Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.5277417Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5277541Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5277818Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5277965Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5278286Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5278434Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5278710Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5278848Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5279124Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5279272Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5279763Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 
2025-12-04T13:21:31.5279888Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5280083Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5280455Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5280568Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5280779Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5280944Z [rank0]:E1204 13:21:06.896000 565297 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5280981Z dist init r=0, world=4 2025-12-04T13:21:31.5281019Z FAILED [7.5117s] [100%] 2025-12-04T13:21:31.5281021Z 2025-12-04T13:21:31.5281079Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5281178Z __ TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda ___ 2025-12-04T13:21:31.5281224Z Traceback (most recent call last): 2025-12-04T13:21:31.5281385Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5281429Z self._join_processes(fn) 2025-12-04T13:21:31.5281602Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5281657Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5281833Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5281877Z raise RuntimeError(error) 2025-12-04T13:21:31.5281958Z RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5282005Z Traceback (most recent call last): 2025-12-04T13:21:31.5282165Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5282208Z getattr(self, test_name)() 2025-12-04T13:21:31.5282374Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5282409Z fn() 2025-12-04T13:21:31.5282561Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5282602Z method(*args, **kwargs) 2025-12-04T13:21:31.5282752Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5282791Z method(*args, **kwargs) 2025-12-04T13:21:31.5282942Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5282978Z with policy(): 2025-12-04T13:21:31.5283130Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in 
__exit__ 2025-12-04T13:21:31.5283170Z raise RuntimeError(msg) 2025-12-04T13:21:31.5283534Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 2025-12-04T13:21:31.5283546Z 2025-12-04T13:21:31.5283622Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5283854Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5283856Z 2025-12-04T13:21:31.5283955Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5283958Z 2025-12-04T13:21:31.5283959Z 2025-12-04T13:21:31.5284035Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5284123Z Process 1 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5284359Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-a6879dc84d5f9c6e.xml - 2025-12-04T13:21:31.5284418Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5284667Z FAILED [7.5117s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda - RuntimeError: Process 1 exited with error code 10 and exception: 2025-12-04T13:21:31.5284713Z Traceback (most recent call last): 2025-12-04T13:21:31.5284875Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5284919Z getattr(self, test_name)() 2025-12-04T13:21:31.5285078Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5285113Z fn() 2025-12-04T13:21:31.5285263Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5285304Z method(*args, **kwargs) 2025-12-04T13:21:31.5285454Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5285495Z method(*args, **kwargs) 2025-12-04T13:21:31.5285644Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5285681Z with policy(): 2025-12-04T13:21:31.5285831Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5285872Z raise RuntimeError(msg) 2025-12-04T13:21:31.5286233Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T13:21:31.5286236Z 2025-12-04T13:21:31.5286312Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5286543Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5286545Z 2025-12-04T13:21:31.5286632Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5286695Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5286756Z ======================= 1 failed, 18 deselected in 7.65s ======================= 2025-12-04T13:21:31.5286792Z Got exit code 1 2025-12-04T13:21:31.5286832Z Retrying single test... 2025-12-04T13:21:31.5287022Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2f6c6c23e79f8289.xml 2025-12-04T13:21:31.5287089Z ============================= test session starts ============================== 2025-12-04T13:21:31.5287202Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5287254Z cachedir: .pytest_cache 2025-12-04T13:21:31.5287413Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5287459Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5287498Z configfile: pytest.ini 2025-12-04T13:21:31.5287662Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5287747Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5287972Z stepcurrent: skipping 18 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5288017Z Running 1 items in this shard 2025-12-04T13:21:31.5288019Z 2025-12-04T13:21:31.5288355Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda I1204 13:21:10.483000 565606 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 565675 2025-12-04T13:21:31.5288510Z I1204 13:21:10.484000 565606 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 565676 2025-12-04T13:21:31.5288662Z I1204 13:21:10.484000 565606 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 565677 2025-12-04T13:21:31.5288813Z I1204 13:21:10.485000 565606 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 565678 2025-12-04T13:21:31.5289175Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5289223Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5289717Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5289780Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5290153Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5290201Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5290689Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5290750Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5291101Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5291148Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5291519Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5291576Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5292062Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5292133Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5292692Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5292751Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5292896Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5293058Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5293348Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5293504Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5293794Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5293920Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5294197Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5294346Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5294634Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5294781Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5295061Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5295197Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5295475Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5295624Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5296115Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 
2025-12-04T13:21:31.5296243Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5296437Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5296808Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5296923Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5297134Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5297299Z [rank2]:E1204 13:21:16.881000 565677 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5297339Z dist init r=2, world=4 2025-12-04T13:21:31.5297477Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5297638Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5297926Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5298079Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5298430Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5298555Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5298831Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5298993Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5299269Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5299418Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5299695Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5299832Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5300121Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5300271Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5300762Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T13:21:31.5300890Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5301086Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5301445Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5301560Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5301770Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5301936Z [rank3]:E1204 13:21:16.887000 565678 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5301976Z dist init r=3, world=4 2025-12-04T13:21:31.5302114Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5302273Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5302559Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5302712Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5303045Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5303184Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5303461Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5303609Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.5303884Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5304031Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5304309Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5304456Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5304742Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5304890Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5305378Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 2025-12-04T13:21:31.5305494Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5305687Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5306045Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5306159Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5306370Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5306536Z [rank0]:E1204 13:21:16.974000 565675 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5306574Z dist init r=0, world=4 2025-12-04T13:21:31.5306714Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5306873Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5307162Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5307315Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.5307614Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5307739Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5308014Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5308199Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5308476Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5308635Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5308911Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5309061Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5309337Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5309499Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5309975Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T13:21:31.5310090Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5310284Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5310640Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5310754Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5310965Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5311130Z [rank1]:E1204 13:21:16.977000 565676 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5311168Z dist init r=1, world=4 2025-12-04T13:21:31.5311206Z FAILED [7.5141s] [100%] 2025-12-04T13:21:31.5311208Z 2025-12-04T13:21:31.5311266Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5311366Z __ TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda ___ 2025-12-04T13:21:31.5311412Z Traceback (most recent call last): 2025-12-04T13:21:31.5311585Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5311629Z self._join_processes(fn) 2025-12-04T13:21:31.5311802Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5311857Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5312033Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5312077Z raise RuntimeError(error) 2025-12-04T13:21:31.5312157Z RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.5312203Z Traceback (most recent call last): 2025-12-04T13:21:31.5312364Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5312407Z getattr(self, test_name)() 2025-12-04T13:21:31.5312565Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5312600Z fn() 2025-12-04T13:21:31.5312763Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5312813Z method(*args, **kwargs) 2025-12-04T13:21:31.5312963Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5313003Z method(*args, **kwargs) 2025-12-04T13:21:31.5313153Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5313201Z with policy(): 2025-12-04T13:21:31.5313353Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in 
__exit__ 2025-12-04T13:21:31.5313393Z raise RuntimeError(msg) 2025-12-04T13:21:31.5313747Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T13:21:31.5313750Z 2025-12-04T13:21:31.5313825Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5314057Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5314059Z 2025-12-04T13:21:31.5314147Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5314150Z 2025-12-04T13:21:31.5314209Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5314254Z Traceback (most recent call last): 2025-12-04T13:21:31.5314418Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5314459Z getattr(self, test_name)() 2025-12-04T13:21:31.5314618Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5314652Z fn() 2025-12-04T13:21:31.5314803Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5314842Z method(*args, **kwargs) 2025-12-04T13:21:31.5314993Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5315034Z method(*args, **kwargs) 2025-12-04T13:21:31.5315183Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5315221Z with policy(): 2025-12-04T13:21:31.5315381Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5315422Z raise RuntimeError(msg) 2025-12-04T13:21:31.5315776Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T13:21:31.5315779Z 2025-12-04T13:21:31.5315852Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5316082Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5316085Z 2025-12-04T13:21:31.5316173Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5316175Z 2025-12-04T13:21:31.5316177Z 2025-12-04T13:21:31.5316253Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5316351Z Process 2 terminated with exit code 10, terminating remaining processes. 
2025-12-04T13:21:31.5316585Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-2f6c6c23e79f8289.xml - 2025-12-04T13:21:31.5316654Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5316903Z FAILED [7.5141s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda - RuntimeError: Process 2 exited with error code 10 and exception: 2025-12-04T13:21:31.5316959Z Traceback (most recent call last): 2025-12-04T13:21:31.5317122Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5317163Z getattr(self, test_name)() 2025-12-04T13:21:31.5317323Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5317356Z fn() 2025-12-04T13:21:31.5317509Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5317549Z method(*args, **kwargs) 2025-12-04T13:21:31.5317699Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5317737Z method(*args, **kwargs) 2025-12-04T13:21:31.5317887Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5317924Z with policy(): 2025-12-04T13:21:31.5318076Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5318116Z raise RuntimeError(msg) 2025-12-04T13:21:31.5318514Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 
2025-12-04T13:21:31.5318517Z 2025-12-04T13:21:31.5318591Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5318820Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5318823Z 2025-12-04T13:21:31.5318911Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5318913Z 2025-12-04T13:21:31.5318970Z Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5319016Z Traceback (most recent call last): 2025-12-04T13:21:31.5319193Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5319235Z getattr(self, test_name)() 2025-12-04T13:21:31.5319395Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5319429Z fn() 2025-12-04T13:21:31.5319579Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5319619Z method(*args, **kwargs) 2025-12-04T13:21:31.5319768Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5319808Z method(*args, **kwargs) 2025-12-04T13:21:31.5319957Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5319994Z with policy(): 2025-12-04T13:21:31.5320146Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5320187Z raise RuntimeError(msg) 2025-12-04T13:21:31.5320554Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T13:21:31.5320568Z 2025-12-04T13:21:31.5320640Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5320868Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5320893Z 2025-12-04T13:21:31.5320979Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5321044Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5321105Z ======================= 1 failed, 18 deselected in 7.65s ======================= 2025-12-04T13:21:31.5321144Z Got exit code 1 2025-12-04T13:21:31.5321183Z Retrying single test... 
2025-12-04T13:21:31.5321374Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4d126ec424ab47b8.xml 2025-12-04T13:21:31.5321431Z ============================= test session starts ============================== 2025-12-04T13:21:31.5321543Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5321583Z cachedir: .pytest_cache 2025-12-04T13:21:31.5321743Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5321789Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5321829Z configfile: pytest.ini 2025-12-04T13:21:31.5321992Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5322067Z collecting ... collected 60 items / 18 deselected / 42 selected 2025-12-04T13:21:31.5322291Z stepcurrent: skipping 18 already run items. Running only test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5322335Z Running 1 items in this shard 2025-12-04T13:21:31.5322337Z 2025-12-04T13:21:31.5322646Z distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda I1204 13:21:20.916000 565984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 0 with pid 566053 2025-12-04T13:21:31.5322802Z I1204 13:21:20.917000 565984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 1 with pid 566054 2025-12-04T13:21:31.5322964Z I1204 13:21:20.917000 565984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 2 with pid 566055 2025-12-04T13:21:31.5323115Z I1204 13:21:20.918000 565984 site-packages/torch/testing/_internal/common_distributed.py:849] Started process 3 with pid 566056 2025-12-04T13:21:31.5323474Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5323523Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5324016Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 3, which does not have an explicit index. FSDP will use the current device 3. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5324080Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5324442Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5324500Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5324989Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 2, which does not have an explicit index. FSDP will use the current device 2. 
If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5325061Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5325415Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5325461Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5325948Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 1, which does not have an explicit index. FSDP will use the current device 1. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 2025-12-04T13:21:31.5326007Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5326359Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/nn/modules/transformer.py:144: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance) 2025-12-04T13:21:31.5326404Z self.encoder = TransformerEncoder( 2025-12-04T13:21:31.5326891Z /opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/distributed/fsdp/_init_utils.py:571: UserWarning: FSDP got the argument `device_id` cuda on rank 0, which does not have an explicit index. FSDP will use the current device 0. If this is incorrect, please explicitly call `torch.cuda.set_device()` before FSDP initialization or pass in the explicit device index as the `device_id` argument. 
2025-12-04T13:21:31.5326951Z device_from_device_id = _get_device_from_device_id( 2025-12-04T13:21:31.5327095Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5327257Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5327559Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5327715Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5328000Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5328125Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5328447Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5328596Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5328889Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5329048Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5329322Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5329471Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5329751Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5329901Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5330386Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 
2025-12-04T13:21:31.5330503Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5330699Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5331061Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5331176Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5331388Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5331554Z [rank3]:E1204 13:21:27.293000 566056 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 3 with exit code: 10 2025-12-04T13:21:31.5331591Z dist init r=3, world=4 2025-12-04T13:21:31.5331743Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5331902Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5332192Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5332345Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5332630Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5332755Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5333041Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5333198Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5333473Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5333633Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5333908Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5334046Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 
2025-12-04T13:21:31.5334322Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5334473Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5334953Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 2. CUDA driver allocated memory was 2300575744 and is now 3105882112. 2025-12-04T13:21:31.5335068Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5335265Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5335621Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5335736Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5335966Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5336133Z [rank2]:E1204 13:21:27.342000 566055 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 2 with exit code: 10 2025-12-04T13:21:31.5336170Z dist init r=2, world=4 2025-12-04T13:21:31.5336309Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5336469Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5336756Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5336912Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] getattr(self, test_name)() 2025-12-04T13:21:31.5337207Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5337349Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5337623Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5337771Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, 
**kwargs) 2025-12-04T13:21:31.5338059Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5338247Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5338522Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5338658Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5338935Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5339086Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5339567Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 0. CUDA driver allocated memory was 2453667840 and is now 3258974208. 2025-12-04T13:21:31.5339682Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5339877Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5340247Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5340360Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5340572Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5340735Z [rank0]:E1204 13:21:27.349000 566053 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 0 with exit code: 10 2025-12-04T13:21:31.5340774Z dist init r=0, world=4 2025-12-04T13:21:31.5340910Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] Caught exception: 2025-12-04T13:21:31.5341071Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] Traceback (most recent call last): 2025-12-04T13:21:31.5341358Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5341523Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] 
getattr(self, test_name)() 2025-12-04T13:21:31.5341822Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5341944Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] fn() 2025-12-04T13:21:31.5342234Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5342382Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5342659Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5342806Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] method(*args, **kwargs) 2025-12-04T13:21:31.5343080Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5343217Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] with policy(): 2025-12-04T13:21:31.5343494Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5343643Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] raise RuntimeError(msg) 2025-12-04T13:21:31.5344121Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 1. CUDA driver allocated memory was 2317352960 and is now 3122659328. 
2025-12-04T13:21:31.5344236Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5344441Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5344797Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5344911Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] 2025-12-04T13:21:31.5345122Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5345288Z [rank1]:E1204 13:21:27.351000 566054 site-packages/torch/testing/_internal/common_distributed.py:935] exiting process 1 with exit code: 10 2025-12-04T13:21:31.5345325Z dist init r=1, world=4 2025-12-04T13:21:31.5345364Z FAILED [7.4123s] [100%] 2025-12-04T13:21:31.5345367Z 2025-12-04T13:21:31.5345424Z =================================== FAILURES =================================== 2025-12-04T13:21:31.5345533Z __ TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda ___ 2025-12-04T13:21:31.5345579Z Traceback (most recent call last): 2025-12-04T13:21:31.5345751Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 770, in wrapper 2025-12-04T13:21:31.5345796Z self._join_processes(fn) 2025-12-04T13:21:31.5345967Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1039, in _join_processes 2025-12-04T13:21:31.5346021Z self._check_return_codes(fn, elapsed_time) 2025-12-04T13:21:31.5346209Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 1079, in _check_return_codes 2025-12-04T13:21:31.5346253Z raise RuntimeError(error) 2025-12-04T13:21:31.5346333Z RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5346379Z Traceback (most recent call last): 2025-12-04T13:21:31.5346540Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5346584Z getattr(self, test_name)() 2025-12-04T13:21:31.5346742Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5346776Z fn() 2025-12-04T13:21:31.5346926Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5346966Z method(*args, **kwargs) 2025-12-04T13:21:31.5347119Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5347158Z method(*args, **kwargs) 2025-12-04T13:21:31.5347309Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5347346Z with policy(): 2025-12-04T13:21:31.5347498Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in 
__exit__ 2025-12-04T13:21:31.5347540Z raise RuntimeError(msg) 2025-12-04T13:21:31.5347891Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 2025-12-04T13:21:31.5347895Z 2025-12-04T13:21:31.5347970Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5348239Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5348241Z 2025-12-04T13:21:31.5348342Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5348345Z 2025-12-04T13:21:31.5348347Z 2025-12-04T13:21:31.5348424Z ----------------------------- Captured stdout call ----------------------------- 2025-12-04T13:21:31.5348512Z Process 3 terminated with exit code 10, terminating remaining processes. 2025-12-04T13:21:31.5348746Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-4d126ec424ab47b8.xml - 2025-12-04T13:21:31.5348805Z =========================== short test summary info ============================ 2025-12-04T13:21:31.5349053Z FAILED [7.4123s] distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda - RuntimeError: Process 3 exited with error code 10 and exception: 2025-12-04T13:21:31.5349100Z Traceback (most recent call last): 2025-12-04T13:21:31.5349265Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 925, in run_test 2025-12-04T13:21:31.5349306Z getattr(self, test_name)() 2025-12-04T13:21:31.5349478Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_distributed.py", line 772, in wrapper 2025-12-04T13:21:31.5349524Z fn() 2025-12-04T13:21:31.5349675Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5349715Z method(*args, **kwargs) 2025-12-04T13:21:31.5349866Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3329, in wrapper 2025-12-04T13:21:31.5349917Z method(*args, **kwargs) 2025-12-04T13:21:31.5350066Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 3328, in wrapper 2025-12-04T13:21:31.5350103Z with policy(): 2025-12-04T13:21:31.5350254Z File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/testing/_internal/common_utils.py", line 2705, in __exit__ 2025-12-04T13:21:31.5350295Z raise RuntimeError(msg) 2025-12-04T13:21:31.5350651Z RuntimeError: CUDA driver API confirmed a leak in __mp_main__.TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda! Caching allocator allocated memory was 512 and is now reported as 27136 on device 3. CUDA driver allocated memory was 2250244096 and is now 3055550464. 
2025-12-04T13:21:31.5350654Z 2025-12-04T13:21:31.5350729Z To execute this test, run the following from the base repo dir: 2025-12-04T13:21:31.5350960Z PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TEST_CUDA_MEM_LEAK_CHECK=1 python test/distributed/fsdp/test_fsdp_core.py TestParamInitCUDA.test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5350963Z 2025-12-04T13:21:31.5351050Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0 2025-12-04T13:21:31.5351113Z !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! 2025-12-04T13:21:31.5351175Z ======================= 1 failed, 18 deselected in 7.55s ======================= 2025-12-04T13:21:31.5351213Z Got exit code 1 2025-12-04T13:21:31.5351391Z FAILED CONSISTENTLY: test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda 2025-12-04T13:21:31.5351521Z Test failed consistently, continuing with the rest of the tests due to continue-through-error being set 2025-12-04T13:21:31.5351709Z Test results will be stored in test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-83a646cba36145fb.xml 2025-12-04T13:21:31.5351767Z ============================= test session starts ============================== 2025-12-04T13:21:31.5351879Z platform linux -- Python 3.12.5, pytest-7.3.2, pluggy-1.6.0 -- /opt/conda/envs/py_3.12/bin/python 2025-12-04T13:21:31.5351920Z cachedir: .pytest_cache 2025-12-04T13:21:31.5352086Z hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow] 2025-12-04T13:21:31.5352133Z rootdir: /var/lib/jenkins/pytorch 2025-12-04T13:21:31.5352174Z configfile: pytest.ini 2025-12-04T13:21:31.5352337Z plugins: hypothesis-6.56.4, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-14.0, subtests-0.13.1, xdist-3.3.1, xdoctest-1.3.0, typeguard-4.3.0 2025-12-04T13:21:31.5352412Z collecting ... collected 60 items / 19 deselected / 41 selected 2025-12-04T13:21:31.5352465Z stepcurrent: skipping 19 already run items. 
2025-12-04T13:21:31.5352509Z Running 0 items in this shard 2025-12-04T13:21:31.5352511Z 2025-12-04T13:21:31.5352743Z - generated xml file: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.fsdp.test_fsdp_core/distributed.fsdp.test_fsdp_core-83a646cba36145fb.xml - 2025-12-04T13:21:31.5352802Z ============================ 19 deselected in 0.01s ============================ 2025-12-04T13:21:31.5356005Z The following tests failed consistently: ['test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_after_state_dict_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestHooksCUDA::test_pre_backward_hook_registration_cuda_first_True_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_optim_step_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_delayed_reduce_scatter_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_mixture_of_experts_with_delay_before_free_offload_true_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_always_wrap_model_offload_true_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_offload_true_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_no_shard_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_nested_wrapped_model_single_iteration_mixed_precision_offload_false_none_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParityWithDDPCUDA::test_transformer_offload_false_shard_grad_op_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestNoGradCUDA::test_transformer_no_grad_mixed_precision_True_cuda', 'test/distributed/fsdp/test_fsdp_core.py::TestParamInitCUDA::test_param_change_after_init_mixed_precision_False_cuda'] 2025-12-04T13:21:31.5356030Z 2025-12-04T13:21:31.5356216Z FINISHED PRINTING LOG FILE of distributed/fsdp/test_fsdp_core 1/3 (test/test-reports/distributed.fsdp.test_fsdp_core_1.3_b5bdac945a318f3b_.log) 2025-12-04T13:21:31.5356218Z 2025-12-04T13:21:31.5356341Z Finished distributed/fsdp/test_fsdp_core 1/3 ... 
[2025-12-04 13:21:31.306051][2293989.955230869], took 23.36min 2025-12-04T13:21:31.5356604Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:21:31.5356690Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:21:31.5356795Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T13:21:31.5356843Z Uploading artifacts took 0.00 seconds 2025-12-04T13:21:31.5356897Z distributed/fsdp/test_fsdp_core 1/3 failed! 2025-12-04T13:21:31.5357006Z Running distributed/test_c10d_spawn_gloo 1/1 ... [2025-12-04 13:21:31.310061][2293989.959244515] 2025-12-04T13:21:31.5357055Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:21:31.5357386Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_spawn_gloo.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:21:31.310253] 2025-12-04T13:22:30.7624246Z 2025-12-04T13:22:30.7625268Z distributed/test_c10d_spawn_gloo 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_spawn_gloo_1.1_16b0e09937d5ce50_.log 2025-12-04T13:22:30.7629877Z Running 11 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cpu, test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cuda, test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_rnn, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_gather, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all_single, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_allreduce, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_broadcast, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_gather, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_reduce, test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_scatter 2025-12-04T13:22:30.7631760Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cpu 2025-12-04T13:22:30.7632123Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_cuda 2025-12-04T13:22:30.7632479Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::DistributedDataParallelSingleProcessTest::test_rnn 2025-12-04T13:22:30.7632822Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_gather 2025-12-04T13:22:30.7633152Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all 2025-12-04T13:22:30.7633493Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_all_to_all_single 2025-12-04T13:22:30.7633833Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_allreduce 2025-12-04T13:22:30.7634163Z 
Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_broadcast 2025-12-04T13:22:30.7634486Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_gather 2025-12-04T13:22:30.7634804Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_reduce 2025-12-04T13:22:30.7635123Z Running 1 items in this shard: test/distributed/test_c10d_spawn_gloo.py::TestDistributedNNFunctionsGloo::test_scatter 2025-12-04T13:22:30.7635302Z 2025-12-04T13:22:30.7635425Z Finished distributed/test_c10d_spawn_gloo 1/1 ... [2025-12-04 13:22:30.762114][2294049.4112939], took 0.99min 2025-12-04T13:22:30.7639412Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:22:30.7655447Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:22:30.7658932Z Running distributed/test_c10d_spawn_ucc 1/1 ... [2025-12-04 13:22:30.765755][2294049.414937132] 2025-12-04T13:22:30.7659132Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:22:30.7660622Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_spawn_ucc.py', '--shard-id=1', '--num-shards=1', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:22:30.765958] 2025-12-04T13:22:44.7105997Z 2025-12-04T13:22:44.7106850Z distributed/test_c10d_spawn_ucc 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_spawn_ucc_1.1_2d2963b015177e4d_.log 2025-12-04T13:22:44.7109248Z Running 6 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_gather, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all_single, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_allreduce, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_broadcast, test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_reduce 2025-12-04T13:22:44.7111001Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_gather 2025-12-04T13:22:44.7111575Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all 2025-12-04T13:22:44.7112216Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_all_to_all_single 2025-12-04T13:22:44.7112826Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_allreduce 2025-12-04T13:22:44.7113388Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_broadcast 2025-12-04T13:22:44.7113939Z Running 1 items in this shard: test/distributed/test_c10d_spawn_ucc.py::TestDistributedNNFunctionsUcc::test_reduce 2025-12-04T13:22:44.7114245Z 2025-12-04T13:22:44.7114466Z Finished distributed/test_c10d_spawn_ucc 1/1 ... 
[2025-12-04 13:22:44.710382][2294063.359561743], took 0.23min 2025-12-04T13:22:44.7121927Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:22:44.7137968Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:22:44.7141345Z Running distributed/test_c10d_gloo 1/2 ... [2025-12-04 13:22:44.713984][2294063.363167496] 2025-12-04T13:22:44.7141589Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:22:44.7142852Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_gloo.py', '--shard-id=1', '--num-shards=2', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:22:44.714164] 2025-12-04T13:32:54.3360990Z 2025-12-04T13:32:54.3362303Z distributed/test_c10d_gloo 1/2 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_gloo_1.2_a9670515dc54cf51_.log 2025-12-04T13:32:54.3387479Z Running 127 items in this shard: test/distributed/test_c10d_gloo.py::RendezvousTCPTest::test_tcp_init, test/distributed/test_c10d_gloo.py::TimeoutTest::test_default_store_timeout_gloo, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_into_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_checks_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_overall_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_barrier_implies_wait, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_empty_tensors, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_multi_device_constructor, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_stress, 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_send_recv_all_to_all, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_set_gloo_pg_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_dataclass_output, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_weight_sharing, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_static_graph_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_False, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_True, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_cpu, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_gpu_gloo, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_sparse_gradients, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad_with_grad_is_view, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module_grad_is_view, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_output, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_sharded_tensor, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients_grad_is_view, test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sync_batch_norm_empty_input, test/distributed/test_c10d_gloo.py::ReducerTest::test_forward_backward, test/distributed/test_c10d_gloo.py::ReducerTest::test_multi_dtype_single_bucket, test/distributed/test_c10d_gloo.py::ReducerTest::test_single_dtype_single_bucket, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_inference_mode, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_into_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_basics_cuda, 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_op_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_overall_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_block_current_stream_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_empty_tensors, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_noncontiguous_input, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_send_recv_all_to_all, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_set_gloo_pg_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_async, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_inference_mode, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_op_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_overall_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_stress, 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_block_current_stream_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_checks, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter_tensor_coalesced, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_stress_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_all_to_all, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_complex, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_set_gloo_pg_timeout, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_short_json, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_basics_cuda, test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_checks, test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cpu, test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cuda, test/distributed/test_c10d_gloo.py::CommTest::test_gloo_rank_membership, test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_default_pg_gloo, test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_gloo_new_group, test/distributed/test_c10d_gloo.py::CommTest::test_tensor_dtype_complex, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_allgather_coalesced, test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_init_process_group_for_all_backends, test/distributed/test_c10d_gloo.py::LargeCommTest::test_new_group_local_sync_duplicate_pg 2025-12-04T13:32:54.3403778Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::RendezvousTCPTest::test_tcp_init 2025-12-04T13:32:54.3404074Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::TimeoutTest::test_default_store_timeout_gloo 2025-12-04T13:32:54.3404392Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_async 2025-12-04T13:32:54.3404720Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_coalesced_checks 2025-12-04T13:32:54.3405049Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_into_tensor_coalesced 2025-12-04T13:32:54.3405372Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress 2025-12-04T13:32:54.3405675Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allgather_stress_cuda 2025-12-04T13:32:54.3405977Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics 2025-12-04T13:32:54.3406367Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_basics_cuda 2025-12-04T13:32:54.3406667Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_checks 2025-12-04T13:32:54.3406972Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_basics 2025-12-04T13:32:54.3407340Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_coalesced_checks_cuda 2025-12-04T13:32:54.3407666Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_overall_timeout 2025-12-04T13:32:54.3407968Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_allreduce_stress 2025-12-04T13:32:54.3408302Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_barrier_implies_wait 2025-12-04T13:32:54.3408603Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics 2025-12-04T13:32:54.3408923Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_basics_cuda 2025-12-04T13:32:54.3409240Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress 2025-12-04T13:32:54.3409538Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_broadcast_stress_cuda 2025-12-04T13:32:54.3409836Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_empty_tensors 2025-12-04T13:32:54.3410147Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_gather_stress_cuda 2025-12-04T13:32:54.3410453Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_multi_device_constructor 2025-12-04T13:32:54.3410758Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_basics 2025-12-04T13:32:54.3411046Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress 2025-12-04T13:32:54.3411339Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_reduce_stress_cuda 2025-12-04T13:32:54.3411640Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics 2025-12-04T13:32:54.3411933Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_basics_cuda 2025-12-04T13:32:54.3412229Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_scatter_stress 2025-12-04T13:32:54.3412526Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_send_recv_all_to_all 2025-12-04T13:32:54.3412828Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_set_gloo_pg_timeout 2025-12-04T13:32:54.3413137Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics 2025-12-04T13:32:54.3413460Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooTest::test_sparse_allreduce_basics_cuda 2025-12-04T13:32:54.3413784Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_dataclass_output 
2025-12-04T13:32:54.3414124Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module 2025-12-04T13:32:54.3414503Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_weight_sharing 2025-12-04T13:32:54.3414896Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False 2025-12-04T13:32:54.3415314Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_True 2025-12-04T13:32:54.3415723Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_static_graph_use_reentrant_True 2025-12-04T13:32:54.3416134Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_False 2025-12-04T13:32:54.3416526Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_twice_use_reentrant_True 2025-12-04T13:32:54.3416929Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_unused_params_use_reentrant_True 2025-12-04T13:32:54.3417346Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_True 2025-12-04T13:32:54.3417750Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_cpu 2025-12-04T13:32:54.3418122Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_gpu_gloo 2025-12-04T13:32:54.3418541Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ddp_comm_hook_sparse_gradients 2025-12-04T13:32:54.3418904Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad 2025-12-04T13:32:54.3419302Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_global_local_unused_params_grad_with_grad_is_view 2025-12-04T13:32:54.3419681Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module 2025-12-04T13:32:54.3420037Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_gloo_backend_cpu_module_grad_is_view 2025-12-04T13:32:54.3420385Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_output 2025-12-04T13:32:54.3420713Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_ignored_sharded_tensor 2025-12-04T13:32:54.3421037Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients 2025-12-04T13:32:54.3421372Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sparse_gradients_grad_is_view 2025-12-04T13:32:54.3421726Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::DistributedDataParallelTest::test_sync_batch_norm_empty_input 2025-12-04T13:32:54.3422041Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::ReducerTest::test_forward_backward 2025-12-04T13:32:54.3422323Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ReducerTest::test_multi_dtype_single_bucket 2025-12-04T13:32:54.3422617Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ReducerTest::test_single_dtype_single_bucket 2025-12-04T13:32:54.3422925Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_checks 2025-12-04T13:32:54.3423260Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_coalesced_checks 2025-12-04T13:32:54.3423607Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_inference_mode 2025-12-04T13:32:54.3423960Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_into_tensor_coalesced 2025-12-04T13:32:54.3424317Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allgather_stress 2025-12-04T13:32:54.3424647Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_basics_cuda 2025-12-04T13:32:54.3424974Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_checks 2025-12-04T13:32:54.3425310Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_async 2025-12-04T13:32:54.3425659Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks 2025-12-04T13:32:54.3426017Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_checks_cuda 2025-12-04T13:32:54.3426376Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_coalesced_stress 2025-12-04T13:32:54.3426721Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_op_timeout 2025-12-04T13:32:54.3427079Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_overall_timeout 2025-12-04T13:32:54.3427429Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress 2025-12-04T13:32:54.3427754Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_allreduce_stress_cuda 2025-12-04T13:32:54.3428097Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_block_current_stream_cuda 2025-12-04T13:32:54.3428474Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_basics 2025-12-04T13:32:54.3428793Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress 2025-12-04T13:32:54.3429119Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_broadcast_stress_cuda 2025-12-04T13:32:54.3429446Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_empty_tensors 2025-12-04T13:32:54.3429760Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_basics 2025-12-04T13:32:54.3430070Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_checks 2025-12-04T13:32:54.3430402Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_noncontiguous_input 2025-12-04T13:32:54.3430741Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_gather_stress_cuda 2025-12-04T13:32:54.3431072Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor 2025-12-04T13:32:54.3431425Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_scatter_tensor_coalesced 2025-12-04T13:32:54.3431762Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_reduce_stress 2025-12-04T13:32:54.3432076Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_basics 2025-12-04T13:32:54.3432392Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_scatter_checks 2025-12-04T13:32:54.3432714Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_send_recv_all_to_all 2025-12-04T13:32:54.3433046Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_set_gloo_pg_timeout 2025-12-04T13:32:54.3433397Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooLazyInitTest::test_sparse_allreduce_basics 2025-12-04T13:32:54.3433720Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics 2025-12-04T13:32:54.3434027Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_basics_cuda 2025-12-04T13:32:54.3434345Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_async 2025-12-04T13:32:54.3434671Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_coalesced_checks 2025-12-04T13:32:54.3434993Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_inference_mode 2025-12-04T13:32:54.3435303Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress 2025-12-04T13:32:54.3435610Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allgather_stress_cuda 2025-12-04T13:32:54.3435942Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_basics_cuda 2025-12-04T13:32:54.3436264Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_coalesced_stress 2025-12-04T13:32:54.3436598Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_op_timeout 2025-12-04T13:32:54.3436916Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_overall_timeout 2025-12-04T13:32:54.3437253Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_allreduce_stress 2025-12-04T13:32:54.3437566Z Running 
1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_block_current_stream_cuda 2025-12-04T13:32:54.3437877Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress 2025-12-04T13:32:54.3438216Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_broadcast_stress_cuda 2025-12-04T13:32:54.3438527Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_basics_cuda 2025-12-04T13:32:54.3438829Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_checks 2025-12-04T13:32:54.3439128Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_gather_stress_cuda 2025-12-04T13:32:54.3439428Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_checks 2025-12-04T13:32:54.3439725Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter 2025-12-04T13:32:54.3440046Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_reduce_scatter_tensor_coalesced 2025-12-04T13:32:54.3440377Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_basics_cuda 2025-12-04T13:32:54.3440688Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_scatter_stress_cuda 2025-12-04T13:32:54.3441001Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_all_to_all 2025-12-04T13:32:54.3441313Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_send_recv_complex 2025-12-04T13:32:54.3441617Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_set_gloo_pg_timeout 2025-12-04T13:32:54.3441916Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_short_json 2025-12-04T13:32:54.3442253Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_basics_cuda 2025-12-04T13:32:54.3442583Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::ProcessGroupGlooFRTest::test_sparse_allreduce_checks 2025-12-04T13:32:54.3442893Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cpu 2025-12-04T13:32:54.3443184Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_broadcast_coalesced_gloo_cuda 2025-12-04T13:32:54.3443467Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_gloo_rank_membership 2025-12-04T13:32:54.3443756Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_default_pg_gloo 2025-12-04T13:32:54.3444056Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_sequence_num_set_gloo_new_group 2025-12-04T13:32:54.3444344Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::CommTest::test_tensor_dtype_complex 2025-12-04T13:32:54.3444701Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_allgather_coalesced 2025-12-04T13:32:54.3445129Z Running 1 items in this shard: 
test/distributed/test_c10d_gloo.py::GlooProcessGroupWithDispatchedCollectivesTests::test_init_process_group_for_all_backends 2025-12-04T13:32:54.3445533Z Running 1 items in this shard: test/distributed/test_c10d_gloo.py::LargeCommTest::test_new_group_local_sync_duplicate_pg 2025-12-04T13:32:54.3445714Z 2025-12-04T13:32:54.3445837Z Finished distributed/test_c10d_gloo 1/2 ... [2025-12-04 13:32:54.337459][2294672.98663913], took 10.16min 2025-12-04T13:32:54.3446277Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:32:54.3446677Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:32:54.3446925Z Running distributed/fsdp/test_fsdp_mixed_precision 1/1 ... [2025-12-04 13:32:54.341168][2294672.990352042] 2025-12-04T13:32:54.3447136Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:32:54.3447552Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/fsdp/test_fsdp_mixed_precision.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:32:54.341355] 2025-12-04T13:39:22.5545288Z 2025-12-04T13:39:22.5546515Z distributed/fsdp/test_fsdp_mixed_precision 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.fsdp.test_fsdp_mixed_precision_1.1_2515bba5fc6f1639_.log 2025-12-04T13:39:22.5571628Z Running 66 items in this shard: test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_buffer_dtype_no_root_handle, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_eval_root_cast_inputs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_full_precision_in_eval, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_full_precision_in_eval_buffers, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_full_precision_in_eval_comm, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_grads_reduced_precision, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_input_grads_with_param_mixed_precision, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp32_none, 
test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_diff_buffer_reduce_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_fp16_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_no_mp_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_enable_sharded_grad_scaler, 
test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_param_and_buf_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_false_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp32_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp32_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp64_enable_sharded_grad_scaler, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_e2e_full_shard_mp_only_reduce_offload_true_fp64_none, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_no_reshard_after_forward, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mixed_precision_resnet, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_batchnorm_convert_sync_bn_False, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_batchnorm_convert_sync_bn_True, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_default, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_only_params_and_bufs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_params_and_reduce_diff, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionSharded::test_mp_embedding_reduce, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionUnsharded::test_grads_reduced_precision, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionUnsharded::test_mixed_precision_e2e_full_shard, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionUnsharded::test_mixed_precision_no_reshard_after_forward, 
test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPMixedPrecisionIgnoredModules::test_mixed_precision_with_ignored_module, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_float16_on_one_submodule, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_float16_on_one_submodule_skip_inputs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_float16_on_one_submodule_skip_inputs_error, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_submodules_with_different_precisions, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_submodules_with_different_precisions_error, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPDifferentSubmodulePrecision::test_submodules_with_external_inputs, test/distributed/fsdp/test_fsdp_mixed_precision.py::TestFSDPTrainEval::test_train_ema_eval_flow 2025-12-04T13:39:22.5586187Z 2025-12-04T13:39:22.5586333Z Finished distributed/fsdp/test_fsdp_mixed_precision 1/1 ... [2025-12-04 13:39:22.554498][2295061.203679365], took 6.47min 2025-12-04T13:39:22.5586781Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:39:22.5587169Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:39:22.5587392Z Running distributed/test_c10d_nccl 2/3 ... [2025-12-04 13:39:22.558020][2295061.20720422] 2025-12-04T13:39:22.5587608Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:39:22.5588019Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/test_c10d_nccl.py', '--shard-id=2', '--num-shards=3', '-v', '--subprocess', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... 
[2025-12-04 13:39:22.558192] 2025-12-04T13:49:13.6560390Z 2025-12-04T13:49:13.6561016Z distributed/test_c10d_nccl 2/3 was successful, full logs can be found in artifacts with path test/test-reports/distributed.test_c10d_nccl_2.3_ef0a5ca71e33a7d5_.log 2025-12-04T13:49:13.6574082Z Running 83 items in this shard: test/distributed/test_c10d_nccl.py::TimeoutTest::test_default_store_timeout_nccl, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLInitTest::test_scalable_init, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_abort_in_destroy_pg, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_comm_split_subgroup, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_cuda_event_cache_mthd_race, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_destruct_before_terminate_pg, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_deterministic_mode_no_break, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_extend_nccl_pg_timeout_backend0, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float16, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float64, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float8_e4m3fn, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_check, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_rank_filter, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_new_group_eager_init_False, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_non_blocking_p2p, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_nccl_pg_timeout_backend0, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_process_group_desc, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_basic, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_multiple_iterations, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_subgroup_p2p_eager_init_True, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_accumulate_gradients_module_with_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_arbitrary_forward_return_value, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_bf16_compress_wrapper_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_builtin_ddp_comm_hooks_nccl_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_False, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_static_graph, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_gpu_nccl, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_multi_device_module_config, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_weight_sharing, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_default_ddp_comm_hooks_nccl, 
test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_find_unused_parameters_kwarg_debug_detail, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_grad_layout_2devicemodule, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_invalid_powerSGD_state, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_multiple_outputs_multiple_backward, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_integer_list, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_torch_device_list, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_multi_device_module_device_ids_None, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_pass_default_pg, test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_powerSGD_ddp_comm_hook_nccl_grad_is_view, test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_mixed_ops, test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_errors_nonblocking, test/distributed/test_c10d_nccl.py::NcclUserBufferRegistrationTest::test_nccl_window_registration, test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_manager_nccl, test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_nccl, test/distributed/test_c10d_nccl.py::CommTest::test_broadcast_coalesced_nccl, test/distributed/test_c10d_nccl.py::CommTest::test_nccl_barrier, test/distributed/test_c10d_nccl.py::CommTest::test_nccl_barrier_device_ids, test/distributed/test_c10d_nccl.py::CommTest::test_nccl_warn_not_in_group_debug_off, test/distributed/test_c10d_nccl.py::CommTest::test_nncl_rank_membership, test/distributed/test_c10d_nccl.py::CommTest::test_pass_nccl_options_high_priority_stream, test/distributed/test_c10d_nccl.py::CommTest::test_reduce_scatter_base_k, test/distributed/test_c10d_nccl.py::CommTest::test_unwaited, test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_collectives, test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_default_process_group, test/distributed/test_c10d_nccl.py::LargeCommTest::test_batch_send_recv_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_object_list_subgroup_set_device0_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_object_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_subgroup_group_rank_False, test/distributed/test_c10d_nccl.py::LargeCommTest::test_new_group_local_sync, test/distributed/test_c10d_nccl.py::LargeCommTest::test_scatter_object_list_subgroup_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device0_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device1_group_rank_True, test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_subgroup_group_rank_True_async_op_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce0_timing_enabled_False, 
test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_multiple_resets_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_circular_buffer_full_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_wraparound_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_wraparound_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_False_include_collectives_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_True_include_collectives_False, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_pickle_timing_enabled_False_include_collectives_True, test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_trace_while_active_timing_enabled_True_only_active_False, test/distributed/test_c10d_nccl.py::ProcessGroupNCCLLargerScaleTest::test_comm_recursive_split_group 2025-12-04T13:49:13.6585534Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::TimeoutTest::test_default_store_timeout_nccl 2025-12-04T13:49:13.6585845Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLInitTest::test_scalable_init 2025-12-04T13:49:13.6586168Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_abort_in_destroy_pg 2025-12-04T13:49:13.6586499Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_comm_split_subgroup 2025-12-04T13:49:13.6586835Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_cuda_event_cache_mthd_race 2025-12-04T13:49:13.6587189Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_destruct_before_terminate_pg 2025-12-04T13:49:13.6587542Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_deterministic_mode_no_break 2025-12-04T13:49:13.6587897Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_extend_nccl_pg_timeout_backend0 2025-12-04T13:49:13.6588354Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float16 2025-12-04T13:49:13.6588677Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float64 2025-12-04T13:49:13.6589012Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_assert_float8_e4m3fn 2025-12-04T13:49:13.6589330Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_check 2025-12-04T13:49:13.6589634Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_nan_rank_filter 2025-12-04T13:49:13.6589962Z Running 1 items in this shard: 
test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_new_group_eager_init_False 2025-12-04T13:49:13.6590293Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_non_blocking_p2p 2025-12-04T13:49:13.6590661Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_nccl_pg_timeout_backend0 2025-12-04T13:49:13.6591003Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_set_process_group_desc 2025-12-04T13:49:13.6591329Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_basic 2025-12-04T13:49:13.6591673Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_shrink_group_multiple_iterations 2025-12-04T13:49:13.6592029Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLGroupTest::test_subgroup_p2p_eager_init_True 2025-12-04T13:49:13.6592408Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_accumulate_gradients_module_with_grad_is_view 2025-12-04T13:49:13.6592796Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_arbitrary_forward_return_value 2025-12-04T13:49:13.6593181Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_bf16_compress_wrapper_nccl 2025-12-04T13:49:13.6593575Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_builtin_ddp_comm_hooks_nccl_grad_is_view 2025-12-04T13:49:13.6593956Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_dynamic_module 2025-12-04T13:49:13.6594340Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_once_use_reentrant_False 2025-12-04T13:49:13.6594768Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_checkpointing_weight_sharing_use_reentrant_False 2025-12-04T13:49:13.6595188Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_grad_is_view 2025-12-04T13:49:13.6595593Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_allreduce_hook_nccl_static_graph 2025-12-04T13:49:13.6595992Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_comm_hook_future_passing_gpu_nccl 2025-12-04T13:49:13.6596371Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_multi_device_module_config 2025-12-04T13:49:13.6596724Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_ddp_weight_sharing 2025-12-04T13:49:13.6597067Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_default_ddp_comm_hooks_nccl 2025-12-04T13:49:13.6597458Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_find_unused_parameters_kwarg_debug_detail 2025-12-04T13:49:13.6597830Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_grad_layout_2devicemodule 2025-12-04T13:49:13.6598225Z Running 1 
items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_invalid_powerSGD_state 2025-12-04T13:49:13.6598589Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_multiple_outputs_multiple_backward 2025-12-04T13:49:13.6598982Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_integer_list 2025-12-04T13:49:13.6599403Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_1gpu_module_device_ids_torch_device_list 2025-12-04T13:49:13.6599845Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_nccl_backend_multi_device_module_device_ids_None 2025-12-04T13:49:13.6600231Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_pass_default_pg 2025-12-04T13:49:13.6600588Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::DistributedDataParallelTest::test_powerSGD_ddp_comm_hook_nccl_grad_is_view 2025-12-04T13:49:13.6600942Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::WorkHookTest::test_on_completion_hook_mixed_ops 2025-12-04T13:49:13.6601257Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclErrorHandlingTest::test_nccl_errors_nonblocking 2025-12-04T13:49:13.6601595Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclUserBufferRegistrationTest::test_nccl_window_registration 2025-12-04T13:49:13.6601931Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_manager_nccl 2025-12-04T13:49:13.6602233Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_all_reduce_coalesced_nccl 2025-12-04T13:49:13.6602553Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_broadcast_coalesced_nccl 2025-12-04T13:49:13.6602829Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_nccl_barrier 2025-12-04T13:49:13.6603118Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_nccl_barrier_device_ids 2025-12-04T13:49:13.6603411Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_nccl_warn_not_in_group_debug_off 2025-12-04T13:49:13.6603703Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_nncl_rank_membership 2025-12-04T13:49:13.6604023Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_pass_nccl_options_high_priority_stream 2025-12-04T13:49:13.6604325Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_reduce_scatter_base_k 2025-12-04T13:49:13.6604592Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::CommTest::test_unwaited 2025-12-04T13:49:13.6604917Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_collectives 2025-12-04T13:49:13.6605317Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NcclProcessGroupWithDispatchedCollectivesTests::test_default_process_group 2025-12-04T13:49:13.6605700Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_batch_send_recv_subgroup_group_rank_False 2025-12-04T13:49:13.6606072Z Running 1 items in this shard: 
test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_object_list_subgroup_set_device0_group_rank_True 2025-12-04T13:49:13.6606437Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_False 2025-12-04T13:49:13.6606763Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_broadcast_subgroup_group_rank_True 2025-12-04T13:49:13.6607094Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_object_subgroup_group_rank_False 2025-12-04T13:49:13.6607422Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_gather_subgroup_group_rank_False 2025-12-04T13:49:13.6607728Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_new_group_local_sync 2025-12-04T13:49:13.6608048Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_scatter_object_list_subgroup_group_rank_True 2025-12-04T13:49:13.6608476Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device0_group_rank_True 2025-12-04T13:49:13.6608870Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_object_list_subgroup_set_device1_group_rank_True 2025-12-04T13:49:13.6609273Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::LargeCommTest::test_send_recv_subgroup_group_rank_True_async_op_True 2025-12-04T13:49:13.6609677Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce0_timing_enabled_False 2025-12-04T13:49:13.6610079Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_False 2025-12-04T13:49:13.6610472Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_batched_send_recv_op_sizes_per_coalesce1_timing_enabled_True 2025-12-04T13:49:13.6610848Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_multiple_resets_timing_enabled_True 2025-12-04T13:49:13.6611350Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_circular_buffer_full_timing_enabled_True 2025-12-04T13:49:13.6611709Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_timing_enabled_False 2025-12-04T13:49:13.6612067Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_wraparound_timing_enabled_False 2025-12-04T13:49:13.6612437Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_fr_record_reset_wraparound_timing_enabled_True 2025-12-04T13:49:13.6612801Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_False 2025-12-04T13:49:13.6613175Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_individual_send_recv_op_sizes0_timing_enabled_True 2025-12-04T13:49:13.6613562Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_False_include_collectives_True 2025-12-04T13:49:13.6613943Z Running 1 items in this shard: 
test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_json_timing_enabled_True_include_collectives_False 2025-12-04T13:49:13.6614329Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_short_pickle_timing_enabled_False_include_collectives_True 2025-12-04T13:49:13.6614712Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::NCCLTraceTest::test_trace_while_active_timing_enabled_True_only_active_False 2025-12-04T13:49:13.6615088Z Running 1 items in this shard: test/distributed/test_c10d_nccl.py::ProcessGroupNCCLLargerScaleTest::test_comm_recursive_split_group 2025-12-04T13:49:13.6615291Z 2025-12-04T13:49:13.6615408Z Finished distributed/test_c10d_nccl 2/3 ... [2025-12-04 13:49:13.656413][2295652.30559399], took 9.85min 2025-12-04T13:49:13.6615825Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml 2025-12-04T13:49:13.6616228Z Failed to parse and upload json test reports: Unable to locate credentials 2025-12-04T13:49:13.6616451Z GITHUB_RUN_ID, GITHUB_RUN_ATTEMPT, or ARTIFACTS_FILE_SUFFIX not set, not uploading 2025-12-04T13:49:13.6616634Z Uploading artifacts took 0.00 seconds 2025-12-04T13:49:13.6616834Z Running distributed/elastic/timer/api_test 1/1 ... [2025-12-04 13:49:13.659978][2295652.309162234] 2025-12-04T13:49:13.6617037Z SCRIBE_GRAPHQL_ACCESS_TOKEN is NOT set 2025-12-04T13:49:13.6617449Z Executing ['/opt/conda/envs/py_3.12/bin/python', '-bb', 'distributed/elastic/timer/api_test.py', '--shard-id=1', '--num-shards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '-x', '--reruns=0', '--import-slow-tests', '--import-disabled-tests'] ... [2025-12-04 13:49:13.660138] 2025-12-04T13:49:14.5999329Z 2025-12-04T13:49:14.6000704Z distributed/elastic/timer/api_test 1/1 was successful, full logs can be found in artifacts with path test/test-reports/distributed.elastic.timer.api_test_1.1_86547a72b69ce307_.log 2025-12-04T13:49:14.6001219Z 2025-12-04T13:49:14.6001451Z Finished distributed/elastic/timer/api_test 1/1 ... 
[2025-12-04 13:49:14.599542][2295653.248720519], took 0.02min
2025-12-04T13:49:14.6024004Z Parsing testcases for test report: /var/lib/jenkins/pytorch/test/test-reports/python-pytest/distributed.test_dynamo_distributed/distributed.test_dynamo_distributed-80ae7d871d4f83c4.xml
2025-12-04T13:49:14.6040723Z Failed to parse and upload json test reports: Unable to locate credentials
2025-12-04T13:49:16.8464457Z Running test batch 'tests to run' cost 9398.63 seconds
2025-12-04T13:49:16.8465722Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:16.8469040Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856156_06970a90d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8626721Z /var/lib/jenkins/pytorch/tools/stats/upload_metrics.py:156: UserWarning: Error uploading metric td_test_failure_stats_v2 to DynamoDB: Unable to locate credentials
2025-12-04T13:49:18.8627784Z warn(f"Error uploading metric {metric_name} to DynamoDB: {e}")
2025-12-04T13:49:18.8628292Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8630588Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07caa458d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8644906Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8645410Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cae7f6d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8661428Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8661901Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cb29a0d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8677859Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8678373Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cb6a6ed11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8693825Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8694272Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cba8e4d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8710411Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8710840Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cbea0cd11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8726667Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8727222Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cc2a76d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8742502Z Emitting td_test_failure_stats_v2
2025-12-04T13:49:18.8744690Z Writing 1 documents to S3 ossci-raw-job-status/ossci_uploaded_metrics/td_test_failure_stats_v2_1764856158_07cc6914d11811f0936eb632a3fcafd1
2025-12-04T13:49:18.8759040Z distributed/fsdp/test_fsdp_uneven 1/1 failed!
2025-12-04T13:49:18.8759267Z distributed/fsdp/test_fsdp_exec_order 1/1 failed!
2025-12-04T13:49:18.8759475Z distributed/fsdp/test_fsdp_traversal 1/1 failed!
2025-12-04T13:49:18.8759700Z distributed/fsdp/test_fsdp_multiple_wrapping 1/1 failed!
2025-12-04T13:49:18.8759913Z distributed/fsdp/test_fsdp_checkpoint 1/1 failed!
2025-12-04T13:49:18.8760111Z distributed/fsdp/test_fsdp_fine_tune 1/1 failed!
2025-12-04T13:49:18.8760319Z distributed/fsdp/test_fsdp_dtensor_state_dict 1/1 failed!
2025-12-04T13:49:18.8760522Z distributed/fsdp/test_fsdp_comm 1/1 failed!
2025-12-04T13:49:18.8760694Z distributed/fsdp/test_fsdp_core 1/3 failed!
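Note on the block above: the test harness invokes each selected test file in its own process with explicit --shard-id/--num-shards arguments (see the "Executing [...]" line), records per-file results, attempts to upload td_test_failure_stats_v2 metrics (the uploads fail here with "Unable to locate credentials"), and finally lists the test files that failed in this shard. The following is a minimal, illustrative sketch of deterministic shard assignment; select_shard is a hypothetical helper and is not PyTorch's actual run_test.py sharding code, which also takes recorded test times into account.

    # Illustrative sketch only -- not PyTorch's run_test.py sharding logic.
    # Deterministically assigns test files to 1-based shards.
    def select_shard(test_files, shard_id, num_shards):
        assert 1 <= shard_id <= num_shards
        ordered = sorted(test_files)  # stable order across re-runs
        return [f for i, f in enumerate(ordered) if i % num_shards == shard_id - 1]

    if __name__ == "__main__":
        files = [
            "distributed/test_c10d_nccl.py",
            "distributed/elastic/timer/api_test.py",
            "distributed/fsdp/test_fsdp_core.py",
        ]
        print(select_shard(files, shard_id=2, num_shards=3))

Sorting before dealing files out keeps the assignment stable, so a given shard always receives the same files for a given file list.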
2025-12-04T13:49:19.5289529Z
2025-12-04T13:49:19.5290347Z real 156m44.385s
2025-12-04T13:49:19.5290636Z user 432m13.912s
2025-12-04T13:49:19.5290856Z sys 515m25.252s
2025-12-04T13:49:19.5291021Z + sccache_epilogue
2025-12-04T13:49:19.5291256Z + echo '::group::Sccache Compilation Log'
2025-12-04T13:49:19.5292009Z ##[group]Sccache Compilation Log
2025-12-04T13:49:19.5292798Z + echo '=================== sccache compilation log ==================='
2025-12-04T13:49:19.5293094Z =================== sccache compilation log ===================
2025-12-04T13:49:19.5293524Z + python /var/lib/jenkins/pytorch/.ci/pytorch/print_sccache_log.py /var/lib/jenkins/sccache_error.log
2025-12-04T13:49:19.5370038Z + echo '=========== If your build fails, please take a look at the log above for possible reasons ==========='
2025-12-04T13:49:19.5370435Z =========== If your build fails, please take a look at the log above for possible reasons ===========
2025-12-04T13:49:19.5370734Z + sccache --show-stats
2025-12-04T13:49:19.5391826Z Compile requests 687
2025-12-04T13:49:19.5392036Z Compile requests executed 0
2025-12-04T13:49:19.5392227Z Cache hits 0
2025-12-04T13:49:19.5392398Z Cache misses 0
2025-12-04T13:49:19.5392574Z Cache hits rate -
2025-12-04T13:49:19.5392753Z Cache timeouts 0
2025-12-04T13:49:19.5392934Z Cache read errors 0
2025-12-04T13:49:19.5393112Z Forced recaches 0
2025-12-04T13:49:19.5393282Z Cache write errors 0
2025-12-04T13:49:19.5393547Z Cache errors 0
2025-12-04T13:49:19.5393724Z Compilations 0
2025-12-04T13:49:19.5393960Z Compilation failures 0
2025-12-04T13:49:19.5394140Z Non-cacheable compilations 0
2025-12-04T13:49:19.5394332Z Non-cacheable calls 1
2025-12-04T13:49:19.5394506Z Non-compilation calls 686
2025-12-04T13:49:19.5394695Z Unsupported compiler calls 0
2025-12-04T13:49:19.5394885Z Average cache write 0.000 s
2025-12-04T13:49:19.5395131Z Average compiler 0.000 s
2025-12-04T13:49:19.5395318Z Average cache read hit 0.000 s
2025-12-04T13:49:19.5395518Z Failed distributed compilations 0
2025-12-04T13:49:19.5395645Z
2025-12-04T13:49:19.5395713Z Non-cacheable reasons:
2025-12-04T13:49:19.5395874Z -E 1
2025-12-04T13:49:19.5395987Z
2025-12-04T13:49:19.5396109Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache"
2025-12-04T13:49:19.5396350Z Use direct/preprocessor mode? yes
2025-12-04T13:49:19.5396543Z Version (client) 0.10.0
2025-12-04T13:49:19.5396729Z Max cache size 10 GiB
2025-12-04T13:49:19.5396907Z + sccache --stop-server
2025-12-04T13:49:19.5402571Z Stopping sccache server...
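Note on the sccache epilogue above: 687 compile requests were seen but none were executed or cached -- 686 were non-compilation calls and the single non-cacheable call was a preprocessor-only (-E) invocation -- which is the expected picture for a test-only job that builds nothing. The rough sketch below turns such "key value" stats text into a dictionary; it is illustrative only, and the exact field names and layout of sccache --show-stats can differ between versions.

    # Illustrative sketch only: parse "key value"-style stats lines such as the
    # sccache output above. Field layout is assumed and may vary by version.
    def parse_stats(text):
        stats = {}
        for line in text.splitlines():
            line = line.strip()
            if not line or line.endswith(":"):
                continue  # skip blanks and headers like "Non-cacheable reasons:"
            key, _, value = line.rpartition(" ")  # last token is the value
            stats[key.strip()] = value.strip()
        return stats

    sample = "Compile requests 687\nCompile requests executed 0\nCache hits 0\nCache misses 0"
    parsed = parse_stats(sample)
    print(parsed["Compile requests"], parsed["Cache hits"])  # -> 687 0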
2025-12-04T13:49:19.5404429Z Compile requests 687 2025-12-04T13:49:19.5404812Z Compile requests executed 0 2025-12-04T13:49:19.5405105Z Cache hits 0 2025-12-04T13:49:19.5405387Z Cache misses 0 2025-12-04T13:49:19.5405676Z Cache hits rate - 2025-12-04T13:49:19.5405957Z Cache timeouts 0 2025-12-04T13:49:19.5406232Z Cache read errors 0 2025-12-04T13:49:19.5406504Z Forced recaches 0 2025-12-04T13:49:19.5406771Z Cache write errors 0 2025-12-04T13:49:19.5407042Z Cache errors 0 2025-12-04T13:49:19.5407316Z Compilations 0 2025-12-04T13:49:19.5407614Z Compilation failures 0 2025-12-04T13:49:19.5407893Z Non-cacheable compilations 0 2025-12-04T13:49:19.5408326Z Non-cacheable calls 1 2025-12-04T13:49:19.5408600Z Non-compilation calls 686 2025-12-04T13:49:19.5408871Z Unsupported compiler calls 0 2025-12-04T13:49:19.5409160Z Average cache write 0.000 s 2025-12-04T13:49:19.5409455Z Average compiler 0.000 s 2025-12-04T13:49:19.5409737Z Average cache read hit 0.000 s 2025-12-04T13:49:19.5410025Z Failed distributed compilations 0 2025-12-04T13:49:19.5410213Z 2025-12-04T13:49:19.5410313Z Non-cacheable reasons: 2025-12-04T13:49:19.5410556Z -E 1 2025-12-04T13:49:19.5410964Z 2025-12-04T13:49:19.5411152Z Cache location Local disk: "/var/lib/jenkins/.cache/sccache" 2025-12-04T13:49:19.5411533Z Use direct/preprocessor mode? yes 2025-12-04T13:49:19.5411822Z Version (client) 0.10.0 2025-12-04T13:49:19.5412103Z Max cache size 10 GiB 2025-12-04T13:49:19.5412394Z + echo ::endgroup:: 2025-12-04T13:49:19.5412914Z ##[endgroup] 2025-12-04T13:49:19.5464486Z ##[error]Process completed with exit code 1. 2025-12-04T13:49:19.5493172Z ##[group]Run # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct 2025-12-04T13:49:19.5504608Z # copy test results back to the mounted workspace, needed sudo, resulting permissions were correct 2025-12-04T13:49:19.5505086Z docker exec -t "5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d" sh -c "cd ../pytorch && sudo cp -R test/test-reports ../workspace/test" 2025-12-04T13:49:19.5510177Z shell: /usr/bin/bash -e {0} 2025-12-04T13:49:19.5510299Z env: 2025-12-04T13:49:19.5510413Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:19.5510556Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:19.5510743Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:19.5511020Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:19.5511558Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:19.5512128Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:19.5512247Z AWS_REGION: us-east-1 2025-12-04T13:49:19.5512496Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:19.5512648Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:19.5514657Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:19.5514832Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:19.5515025Z ##[endgroup] 2025-12-04T13:49:19.6342435Z ##[group]Run docker exec -t "5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d" sh -c "sudo chown -R 1001:1001 test" 2025-12-04T13:49:19.6342910Z docker exec -t "5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d" sh -c "sudo chown -R 1001:1001 test" 2025-12-04T13:49:19.6347140Z 
shell: /usr/bin/bash -e {0} 2025-12-04T13:49:19.6347264Z env: 2025-12-04T13:49:19.6347365Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:19.6347509Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:19.6347693Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:19.6347880Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:19.6348575Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:19.6349077Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:19.6349202Z AWS_REGION: us-east-1 2025-12-04T13:49:19.6349392Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:19.6349553Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:19.6351573Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:19.6351751Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:19.6351944Z ##[endgroup] 2025-12-04T13:49:19.7207636Z ##[group]Run cat test/**/*_toprint.log || true 2025-12-04T13:49:19.7207813Z cat test/**/*_toprint.log || true 2025-12-04T13:49:19.7210772Z shell: /usr/bin/bash --noprofile --norc -e -o pipefail {0} 2025-12-04T13:49:19.7210932Z env: 2025-12-04T13:49:19.7211040Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:19.7211188Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:19.7211379Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:19.7211560Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:19.7212097Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:19.7212627Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:19.7212772Z AWS_REGION: us-east-1 2025-12-04T13:49:19.7212921Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:19.7213088Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:19.7215138Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:19.7215317Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:19.7215506Z ##[endgroup] 2025-12-04T13:49:19.7259951Z cat: 'test/**/*_toprint.log': No such file or directory 2025-12-04T13:49:19.7328629Z Prepare all required actions 2025-12-04T13:49:19.7329015Z Getting action download info 2025-12-04T13:49:20.1267717Z Download action repository 'seemethere/upload-artifact-s3@v5' (SHA:baba72d0712b404f646cebe0730933554ebce96a) 2025-12-04T13:49:20.9538045Z Download action repository 'actions/upload-artifact@v4' (SHA:ea165f8d65b6e75b540449e92b4886f43607fa02) 2025-12-04T13:49:21.8795944Z ##[group]Run ./.github/actions/upload-test-artifacts 2025-12-04T13:49:21.8796103Z with: 2025-12-04T13:49:21.8796198Z use-gha: true 2025-12-04T13:49:21.8796362Z file-suffix: test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540 2025-12-04T13:49:21.8796546Z s3-bucket: gha-artifacts 2025-12-04T13:49:21.8796739Z env: 2025-12-04T13:49:21.8796837Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:21.8796976Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:21.8797159Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:21.8797349Z 
RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:21.8797862Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:21.8798422Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:21.8798544Z AWS_REGION: us-east-1 2025-12-04T13:49:21.8798707Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:21.8798865Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:21.8800870Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:21.8801047Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:21.8801237Z ##[endgroup] 2025-12-04T13:49:21.8830940Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:49:21.8831075Z with: 2025-12-04T13:49:21.8831275Z name: test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip 2025-12-04T13:49:21.8831493Z retention-days: 14 2025-12-04T13:49:21.8831607Z if-no-files-found: warn 2025-12-04T13:49:21.8831719Z path: test/**/*.json 2025-12-04T13:49:21.8831829Z compression-level: 6 2025-12-04T13:49:21.8831933Z overwrite: false 2025-12-04T13:49:21.8832044Z include-hidden-files: false 2025-12-04T13:49:21.8832155Z env: 2025-12-04T13:49:21.8832248Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:21.8832388Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:21.8832573Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:21.8832742Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:21.8833251Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:21.8833740Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:21.8833863Z AWS_REGION: us-east-1 2025-12-04T13:49:21.8833998Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:21.8834151Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:21.8836213Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:21.8836387Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:21.8836572Z ##[endgroup] 2025-12-04T13:49:22.3018588Z With the provided path, there will be 6 files uploaded 2025-12-04T13:49:22.3021164Z Artifact name is valid! 2025-12-04T13:49:22.3022049Z Root directory input is valid! 2025-12-04T13:49:22.5822774Z Beginning upload of artifact content to blob storage 2025-12-04T13:49:22.9570981Z Uploaded bytes 44615 2025-12-04T13:49:23.0253320Z Finished uploading artifact content to blob storage! 2025-12-04T13:49:23.0254560Z SHA256 digest of uploaded artifact zip is 522cfd5f062ae50bd9823d80787cbd4928b98ba8f996043c0a02d5a3c891ba7b 2025-12-04T13:49:23.0255468Z Finalizing artifact upload 2025-12-04T13:49:23.1824938Z Artifact test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip.zip successfully finalized. Artifact ID 4764717137 2025-12-04T13:49:23.1826381Z Artifact test-jsons-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip has been successfully uploaded! Final size is 44615 bytes. 
Artifact ID is 4764717137 2025-12-04T13:49:23.1830498Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922798714/artifacts/4764717137 2025-12-04T13:49:23.1955133Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:49:23.1955291Z with: 2025-12-04T13:49:23.1955501Z name: test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip 2025-12-04T13:49:23.1955836Z retention-days: 14 2025-12-04T13:49:23.1955951Z if-no-files-found: ignore 2025-12-04T13:49:23.1956082Z path: test/**/*.xml test/**/*.csv 2025-12-04T13:49:23.1956212Z compression-level: 6 2025-12-04T13:49:23.1956336Z overwrite: false 2025-12-04T13:49:23.1956452Z include-hidden-files: false 2025-12-04T13:49:23.1956569Z env: 2025-12-04T13:49:23.1956669Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:23.1956814Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:23.1957012Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:23.1957189Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:23.1957709Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:23.1958391Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:23.1958514Z AWS_REGION: us-east-1 2025-12-04T13:49:23.1958698Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:23.1958857Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:23.1960888Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:23.1961069Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:23.1961259Z ##[endgroup] 2025-12-04T13:49:23.6035824Z With the provided path, there will be 808 files uploaded 2025-12-04T13:49:23.6039138Z Artifact name is valid! 2025-12-04T13:49:23.6039415Z Root directory input is valid! 2025-12-04T13:49:23.8241770Z Beginning upload of artifact content to blob storage 2025-12-04T13:49:24.5930445Z Uploaded bytes 681492 2025-12-04T13:49:24.6580145Z Finished uploading artifact content to blob storage! 2025-12-04T13:49:24.6582677Z SHA256 digest of uploaded artifact zip is 231eb3f54fc2665f1723cd26e833c8a548e4a409a21546d5f01010862e8d7fa5 2025-12-04T13:49:24.6583301Z Finalizing artifact upload 2025-12-04T13:49:24.8169117Z Artifact test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip.zip successfully finalized. Artifact ID 4764717455 2025-12-04T13:49:24.8170662Z Artifact test-reports-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip has been successfully uploaded! Final size is 681492 bytes. 
Artifact ID is 4764717455 2025-12-04T13:49:24.8175730Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922798714/artifacts/4764717455 2025-12-04T13:49:24.8329444Z ##[group]Run actions/upload-artifact@v4 2025-12-04T13:49:24.8329593Z with: 2025-12-04T13:49:24.8329784Z name: logs-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip 2025-12-04T13:49:24.8329998Z retention-days: 14 2025-12-04T13:49:24.8330116Z if-no-files-found: ignore 2025-12-04T13:49:24.8330244Z path: usage_log.txt test/**/*.log 2025-12-04T13:49:24.8330387Z compression-level: 6 2025-12-04T13:49:24.8330498Z overwrite: false 2025-12-04T13:49:24.8330610Z include-hidden-files: false 2025-12-04T13:49:24.8330729Z env: 2025-12-04T13:49:24.8330824Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:24.8330969Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:24.8331323Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:24.8331500Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:24.8332024Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:24.8332583Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:24.8332713Z AWS_REGION: us-east-1 2025-12-04T13:49:24.8332883Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:24.8333097Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:24.8335162Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:24.8335343Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:24.8335534Z ##[endgroup] 2025-12-04T13:49:25.2826191Z Multiple search paths detected. Calculating the least common ancestor of all paths 2025-12-04T13:49:25.2827282Z The least common ancestor is /home/runner/_work/pytorch/pytorch. This will be the root directory of the artifact 2025-12-04T13:49:25.2827594Z With the provided path, there will be 84 files uploaded 2025-12-04T13:49:25.2830552Z Artifact name is valid! 2025-12-04T13:49:25.2831232Z Root directory input is valid! 2025-12-04T13:49:25.5111496Z Beginning upload of artifact content to blob storage 2025-12-04T13:49:26.0655357Z Uploaded bytes 395449 2025-12-04T13:49:26.1332648Z Finished uploading artifact content to blob storage! 2025-12-04T13:49:26.1333946Z SHA256 digest of uploaded artifact zip is b29fd9d0f808ab53863051eb5997c3790697a39efb0251c741829f3455d61657 2025-12-04T13:49:26.1334892Z Finalizing artifact upload 2025-12-04T13:49:26.2871280Z Artifact logs-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip.zip successfully finalized. Artifact ID 4764717750 2025-12-04T13:49:26.2872707Z Artifact logs-runattempt1-test-distributed-2-3-linux.rocm.gpu.gfx942.4.b_57117547540.zip has been successfully uploaded! Final size is 395449 bytes. Artifact ID is 4764717750 2025-12-04T13:49:26.2876559Z Artifact download URL: https://github.com/pytorch/pytorch/actions/runs/19922798714/artifacts/4764717750 2025-12-04T13:49:26.3020874Z ##[group]Run # shellcheck disable=SC2156 2025-12-04T13:49:26.3021111Z # shellcheck disable=SC2156 2025-12-04T13:49:26.3021413Z find . 
-iname "core.[1-9]*" -exec docker exec "${CONTAINER_NAME}" sh -c "gdb python {} -ex 'bt' -ex 'q'" \; 2025-12-04T13:49:26.3026009Z shell: /usr/bin/bash -e {0} 2025-12-04T13:49:26.3026189Z env: 2025-12-04T13:49:26.3037183Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:26.3037368Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:26.3037568Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:26.3037746Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:26.3038550Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:26.3039059Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:26.3039193Z AWS_REGION: us-east-1 2025-12-04T13:49:26.3039380Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:26.3039551Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:26.3041566Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:26.3041752Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:26.3041952Z ##[endgroup] 2025-12-04T13:49:26.4389640Z ##[group]Run actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 2025-12-04T13:49:26.4389840Z with: 2025-12-04T13:49:26.4389988Z name: coredumps-distributed-2-3-linux.rocm.gpu.gfx942.4.b 2025-12-04T13:49:26.4390163Z retention-days: 14 2025-12-04T13:49:26.4390278Z if-no-files-found: ignore 2025-12-04T13:49:26.4390402Z path: ./**/core.[1-9]* 2025-12-04T13:49:26.4390522Z compression-level: 6 2025-12-04T13:49:26.4390632Z overwrite: false 2025-12-04T13:49:26.4390747Z include-hidden-files: false 2025-12-04T13:49:26.4390868Z env: 2025-12-04T13:49:26.4391052Z GIT_DEFAULT_BRANCH: main 2025-12-04T13:49:26.4391201Z RUNNER_ARTIFACT_DIR: /home/runner/_work/_temp/artifacts 2025-12-04T13:49:26.4391395Z RUNNER_TEST_RESULTS_DIR: /home/runner/_work/_temp/test-results 2025-12-04T13:49:26.4391572Z RUNNER_DOCS_DIR: /home/runner/_work/_temp/docs 2025-12-04T13:49:26.4392114Z GPU_FLAG: --device=/dev/mem --device=/dev/kfd --group-add 110 --device /dev/dri/renderD160 --device /dev/dri/renderD168 --device /dev/dri/renderD176 --device /dev/dri/renderD184 --group-add video --group-add 109 --group-add daemon --group-add bin --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --network=host 2025-12-04T13:49:26.4392690Z AWS_DEFAULT_REGION: us-east-1 2025-12-04T13:49:26.4392820Z AWS_REGION: us-east-1 2025-12-04T13:49:26.4392998Z AWS_ACCESS_KEY_ID: *** 2025-12-04T13:49:26.4393170Z AWS_SECRET_ACCESS_KEY: *** 2025-12-04T13:49:26.4395224Z AWS_SESSION_TOKEN: *** 2025-12-04T13:49:26.4395413Z CONTAINER_NAME: 5d33cd4909ac1c147401856f4c94ba1b47e15bde8a8d3fccefb188f5b658e86d 2025-12-04T13:49:26.4395614Z ##[endgroup] 2025-12-04T13:49:29.8994054Z No files were found with the provided path: ./**/core.[1-9]*. No artifacts will be uploaded. 2025-12-04T13:49:29.9154850Z Post job cleanup. 2025-12-04T13:49:29.9167896Z Post job cleanup. 2025-12-04T13:49:29.9372905Z Logging out of registry 308535385114.dkr.ecr.us-east-1.amazonaws.com 2025-12-04T13:49:29.9593722Z Post job cleanup. 2025-12-04T13:49:30.0259082Z Post job cleanup. 2025-12-04T13:49:30.0290231Z Post job cleanup. 
2025-12-04T13:49:30.0755018Z [command]/usr/bin/git version 2025-12-04T13:49:30.0781385Z git version 2.52.0 2025-12-04T13:49:30.0805407Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/10fbb372-00e7-4b77-8d9d-8659fcf59d40/.gitconfig' 2025-12-04T13:49:30.0811652Z Temporarily overriding HOME='/home/runner/_work/_temp/10fbb372-00e7-4b77-8d9d-8659fcf59d40' before making global git config changes 2025-12-04T13:49:30.0812007Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T13:49:30.0814224Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T13:49:30.0842028Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T13:49:30.0865675Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T13:49:30.1088060Z Entering 'android/libs/fbjni' 2025-12-04T13:49:30.1127411Z Entering 'third_party/FP16' 2025-12-04T13:49:30.1155215Z Entering 'third_party/FXdiv' 2025-12-04T13:49:30.1181340Z Entering 'third_party/NNPACK' 2025-12-04T13:49:30.1214462Z Entering 'third_party/NVTX' 2025-12-04T13:49:30.1256051Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:30.1285454Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:30.1326092Z Entering 'third_party/aiter' 2025-12-04T13:49:30.1355726Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:30.1387422Z Entering 'third_party/benchmark' 2025-12-04T13:49:30.1411470Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:30.1441453Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:30.1466007Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:30.1491907Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:30.1520183Z Entering 'third_party/cutlass' 2025-12-04T13:49:30.1548883Z Entering 'third_party/fbgemm' 2025-12-04T13:49:30.1582881Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:30.1612750Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:30.1640764Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:30.1665615Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:30.1693823Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:30.1722345Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:30.1746167Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:30.1775426Z Entering 'third_party/flash-attention' 2025-12-04T13:49:30.1806253Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:30.1832252Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:30.1860323Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:30.1886809Z Entering 'third_party/fmt' 2025-12-04T13:49:30.1913403Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:30.1938526Z Entering 'third_party/gloo' 2025-12-04T13:49:30.1963794Z Entering 'third_party/googletest' 2025-12-04T13:49:30.1995769Z Entering 'third_party/ideep' 2025-12-04T13:49:30.2021633Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:30.2067129Z Entering 'third_party/ittapi' 2025-12-04T13:49:30.2092911Z Entering 'third_party/kineto' 2025-12-04T13:49:30.2123396Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:30.2151018Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:30.2178999Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:30.2206768Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:30.2230297Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:30.2258695Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:30.2289889Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:30.2313941Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:30.2336140Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:30.2360328Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:30.2383570Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:49:30.2407303Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.2443387Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.2471722Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:49:30.2496450Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:49:30.2522966Z Entering 'third_party/kleidiai' 2025-12-04T13:49:30.2550531Z Entering 'third_party/mimalloc' 2025-12-04T13:49:30.2573623Z Entering 'third_party/nlohmann' 2025-12-04T13:49:30.2600826Z Entering 'third_party/onnx' 2025-12-04T13:49:30.2631923Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:49:30.2670112Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:49:30.2695063Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:49:30.2725613Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:49:30.2759380Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:49:30.2782429Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:49:30.2817427Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:49:30.2840081Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:49:30.2864904Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:49:30.2888614Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.2920145Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.2945725Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:49:30.2977988Z Entering 'third_party/pocketfft' 2025-12-04T13:49:30.3002196Z Entering 'third_party/protobuf' 2025-12-04T13:49:30.3028292Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:49:30.3054312Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:49:30.3084579Z Entering 'third_party/psimd' 2025-12-04T13:49:30.3115553Z Entering 'third_party/pthreadpool' 2025-12-04T13:49:30.3139423Z Entering 'third_party/pybind11' 2025-12-04T13:49:30.3164980Z Entering 'third_party/python-peachpy' 2025-12-04T13:49:30.3189376Z Entering 'third_party/sleef' 2025-12-04T13:49:30.3216490Z Entering 'third_party/tensorpipe' 2025-12-04T13:49:30.3240614Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:49:30.3263506Z Entering 
'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:49:30.3287083Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:49:30.3310151Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:49:30.3337605Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:49:30.3381138Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T13:49:30.3399848Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3406763Z [command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader 2025-12-04T13:49:30.3425851Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T13:49:30.3644962Z Entering 'android/libs/fbjni' 2025-12-04T13:49:30.3661130Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3684405Z Entering 'third_party/FP16' 2025-12-04T13:49:30.3703340Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3723210Z Entering 'third_party/FXdiv' 2025-12-04T13:49:30.3740809Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3757567Z Entering 'third_party/NNPACK' 2025-12-04T13:49:30.3774816Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3794288Z Entering 'third_party/NVTX' 2025-12-04T13:49:30.3816644Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3836611Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:30.3851850Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3870396Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:30.3887301Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3917905Z Entering 'third_party/aiter' 2025-12-04T13:49:30.3935308Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3955804Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:30.3972611Z http.https://github.com/.extraheader 2025-12-04T13:49:30.3997159Z Entering 'third_party/benchmark' 2025-12-04T13:49:30.4019833Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4045619Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:30.4064318Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4098288Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:30.4116230Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4136216Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:30.4153125Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4172713Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:30.4186358Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4204239Z Entering 'third_party/cutlass' 2025-12-04T13:49:30.4226584Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4249480Z Entering 'third_party/fbgemm' 2025-12-04T13:49:30.4265493Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4286428Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:30.4306702Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4324920Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:30.4350931Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4373609Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:30.4394280Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4417557Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:30.4435551Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4459347Z 
Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:30.4480137Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4506912Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:30.4527083Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4545500Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:30.4562274Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4588298Z Entering 'third_party/flash-attention' 2025-12-04T13:49:30.4605586Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4624434Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:30.4640178Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4663970Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:30.4680993Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4703459Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:30.4726524Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4757134Z Entering 'third_party/fmt' 2025-12-04T13:49:30.4776077Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4796408Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:30.4814935Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4835971Z Entering 'third_party/gloo' 2025-12-04T13:49:30.4858526Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4876650Z Entering 'third_party/googletest' 2025-12-04T13:49:30.4897324Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4924581Z Entering 'third_party/ideep' 2025-12-04T13:49:30.4943506Z http.https://github.com/.extraheader 2025-12-04T13:49:30.4965769Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:30.4983594Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5007964Z Entering 'third_party/ittapi' 2025-12-04T13:49:30.5024182Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5045645Z Entering 'third_party/kineto' 2025-12-04T13:49:30.5060618Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5091917Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:30.5106353Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5135962Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:30.5152946Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5177603Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:30.5205342Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5226018Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:30.5246255Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5276723Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:30.5293989Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5320119Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:30.5334121Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5356652Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:30.5383362Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5404083Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:30.5418551Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5438842Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:30.5460431Z http.https://github.com/.extraheader 
2025-12-04T13:49:30.5480148Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:30.5501832Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5522920Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:49:30.5538452Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5559706Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.5575842Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5597429Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.5613868Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5643944Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:49:30.5660958Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5679307Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:49:30.5695706Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5717574Z Entering 'third_party/kleidiai' 2025-12-04T13:49:30.5732976Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5754138Z Entering 'third_party/mimalloc' 2025-12-04T13:49:30.5773835Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5792630Z Entering 'third_party/nlohmann' 2025-12-04T13:49:30.5808464Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5830647Z Entering 'third_party/onnx' 2025-12-04T13:49:30.5846466Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5875795Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:49:30.5897229Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5922578Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:49:30.5937505Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5959206Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:49:30.5973429Z http.https://github.com/.extraheader 2025-12-04T13:49:30.5994163Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:49:30.6010569Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6029074Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:49:30.6049452Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6068553Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:49:30.6081443Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6100988Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:49:30.6115332Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6135119Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:49:30.6149989Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6172549Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:49:30.6190854Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6208020Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.6225652Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6246021Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.6265412Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6285485Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:49:30.6303979Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6331226Z Entering 'third_party/pocketfft' 
2025-12-04T13:49:30.6347018Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6365639Z Entering 'third_party/protobuf' 2025-12-04T13:49:30.6388502Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6416447Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:49:30.6433048Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6459134Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:49:30.6479461Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6502182Z Entering 'third_party/psimd' 2025-12-04T13:49:30.6524755Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6546479Z Entering 'third_party/pthreadpool' 2025-12-04T13:49:30.6563420Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6590105Z Entering 'third_party/pybind11' 2025-12-04T13:49:30.6612764Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6632786Z Entering 'third_party/python-peachpy' 2025-12-04T13:49:30.6658329Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6686744Z Entering 'third_party/sleef' 2025-12-04T13:49:30.6701468Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6724862Z Entering 'third_party/tensorpipe' 2025-12-04T13:49:30.6739921Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6757460Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:49:30.6773382Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6794305Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:49:30.6809163Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6829499Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:49:30.6842414Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6863432Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:49:30.6877146Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6895300Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:49:30.6914499Z http.https://github.com/.extraheader 2025-12-04T13:49:30.6959426Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.6986480Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T13:49:30.7159803Z Entering 'android/libs/fbjni' 2025-12-04T13:49:30.7177360Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T13:49:30.7188507Z Entering 'third_party/FP16' 2025-12-04T13:49:30.7200570Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T13:49:30.7211441Z Entering 'third_party/FXdiv' 2025-12-04T13:49:30.7230886Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T13:49:30.7241713Z Entering 'third_party/NNPACK' 2025-12-04T13:49:30.7252945Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T13:49:30.7262989Z Entering 'third_party/NVTX' 2025-12-04T13:49:30.7274455Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T13:49:30.7284003Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:30.7295345Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T13:49:30.7305951Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:30.7317892Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T13:49:30.7337298Z Entering 'third_party/aiter' 2025-12-04T13:49:30.7349531Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T13:49:30.7358814Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:30.7376250Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T13:49:30.7390310Z Entering 'third_party/benchmark' 2025-12-04T13:49:30.7401789Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:49:30.7411943Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:30.7423150Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T13:49:30.7436653Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:30.7455552Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T13:49:30.7466282Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:30.7477127Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T13:49:30.7486704Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:30.7498791Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T13:49:30.7509141Z Entering 'third_party/cutlass' 2025-12-04T13:49:30.7522759Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T13:49:30.7540877Z Entering 'third_party/fbgemm' 2025-12-04T13:49:30.7555905Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T13:49:30.7566362Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:30.7581359Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T13:49:30.7591062Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:30.7602819Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T13:49:30.7614854Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:30.7625021Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T13:49:30.7639180Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:30.7654118Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T13:49:30.7671186Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:30.7683932Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T13:49:30.7694221Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:30.7705567Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T13:49:30.7715413Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:30.7730503Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T13:49:30.7742168Z Entering 
'third_party/flash-attention' 2025-12-04T13:49:30.7756040Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T13:49:30.7765657Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:30.7777322Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T13:49:30.7788943Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:30.7806964Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T13:49:30.7821451Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:30.7834192Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T13:49:30.7845397Z Entering 'third_party/fmt' 2025-12-04T13:49:30.7857265Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T13:49:30.7867728Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:30.7878828Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T13:49:30.7888801Z Entering 'third_party/gloo' 2025-12-04T13:49:30.7900814Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T13:49:30.7911523Z Entering 'third_party/googletest' 2025-12-04T13:49:30.7922786Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.7935301Z Entering 'third_party/ideep' 2025-12-04T13:49:30.7946781Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T13:49:30.7957165Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:30.7968829Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T13:49:30.7981778Z Entering 'third_party/ittapi' 2025-12-04T13:49:30.7993098Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T13:49:30.8003206Z Entering 'third_party/kineto' 2025-12-04T13:49:30.8015369Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T13:49:30.8025170Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:30.8041432Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T13:49:30.8052029Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:30.8063538Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T13:49:30.8073566Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:30.8095239Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T13:49:30.8104898Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:30.8117174Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T13:49:30.8126915Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:30.8141278Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T13:49:30.8155417Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:30.8176854Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T13:49:30.8188992Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:30.8200194Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T13:49:30.8217227Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:30.8235276Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.8246197Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:30.8257193Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T13:49:30.8266537Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:30.8278278Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T13:49:30.8287417Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:49:30.8301215Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T13:49:30.8310996Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.8326850Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T13:49:30.8337715Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.8351682Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T13:49:30.8365309Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:49:30.8378902Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url 2025-12-04T13:49:30.8388970Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:49:30.8406569Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.8418878Z Entering 'third_party/kleidiai' 2025-12-04T13:49:30.8430434Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url 2025-12-04T13:49:30.8440933Z Entering 'third_party/mimalloc' 
2025-12-04T13:49:30.8452094Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url 2025-12-04T13:49:30.8463143Z Entering 'third_party/nlohmann' 2025-12-04T13:49:30.8475765Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url 2025-12-04T13:49:30.8492660Z Entering 'third_party/onnx' 2025-12-04T13:49:30.8505055Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url 2025-12-04T13:49:30.8521078Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:49:30.8531760Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:49:30.8544568Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:49:30.8558514Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url 2025-12-04T13:49:30.8568698Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:49:30.8580122Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:49:30.8589548Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:49:30.8600526Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.8609933Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:49:30.8624629Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url 2025-12-04T13:49:30.8633432Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:49:30.8644502Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url 2025-12-04T13:49:30.8653264Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:49:30.8668025Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url 2025-12-04T13:49:30.8677132Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:49:30.8688225Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url 2025-12-04T13:49:30.8696274Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:49:30.8706207Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url 2025-12-04T13:49:30.8715064Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:30.8732864Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url 2025-12-04T13:49:30.8743597Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:30.8755451Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url 2025-12-04T13:49:30.8766720Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:49:30.8776562Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url 2025-12-04T13:49:30.8793482Z Entering 'third_party/pocketfft' 2025-12-04T13:49:30.8805311Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url 2025-12-04T13:49:30.8815602Z Entering 'third_party/protobuf' 2025-12-04T13:49:30.8827485Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url 2025-12-04T13:49:30.8838295Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:49:30.8848731Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:49:30.8858918Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:49:30.8874293Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.8887343Z Entering 'third_party/psimd' 2025-12-04T13:49:30.8901938Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url 2025-12-04T13:49:30.8912114Z Entering 'third_party/pthreadpool' 2025-12-04T13:49:30.8934199Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url 2025-12-04T13:49:30.8944465Z Entering 'third_party/pybind11' 2025-12-04T13:49:30.8957878Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:49:30.8968310Z Entering 'third_party/python-peachpy' 2025-12-04T13:49:30.8980474Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url 2025-12-04T13:49:30.8990096Z Entering 'third_party/sleef' 2025-12-04T13:49:30.9003310Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url 2025-12-04T13:49:30.9015049Z Entering 'third_party/tensorpipe' 2025-12-04T13:49:30.9027201Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url 2025-12-04T13:49:30.9038577Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:49:30.9052714Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:30.9066278Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:49:30.9080133Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url 2025-12-04T13:49:30.9090048Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:49:30.9101363Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url 2025-12-04T13:49:30.9110859Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:49:30.9124149Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url 2025-12-04T13:49:30.9133474Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:49:30.9143830Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url 2025-12-04T13:49:30.9179587Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only 
--get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9200654Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9219563Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9236899Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9254490Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9270871Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9287742Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9306943Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9323412Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9339893Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9357070Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9373934Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9390476Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9409506Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9425180Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9440995Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9456779Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9472966Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9492535Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9506857Z 
[command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9527710Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9545513Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9561106Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9578204Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9593790Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9612736Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9629795Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9645620Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9661812Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9677397Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9693041Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9709399Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9725530Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9741751Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9760313Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9779110Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9796044Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config 
--name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9826204Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9834753Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9853404Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9872919Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9897789Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9914533Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9931203Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9947546Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9963913Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:30.9986749Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0003124Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0019990Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0037409Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0054449Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0070128Z [command]/usr/bin/git config --file 
/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0086369Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0103531Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0120306Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0136130Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0151611Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0170141Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0187296Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0205769Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0222536Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0242794Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0259143Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0275893Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0292025Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0309492Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0326043Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0343921Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only 
--get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0361382Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0377952Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0393185Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0409791Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0426825Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0449480Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0465917Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0482552Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0499457Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0519892Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0536242Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0553257Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0571076Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.0675972Z Post job cleanup. 
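Editor's note, for readability of the cleanup output that follows: the post-job step repeats a single pattern over the repository and every submodule, namely locating a credential-related git config entry and unsetting it if present. The sketch below is reconstructed from the commands logged further down; the scrub helper name is illustrative only and is not part of the workflow, and the regexps are written in simplified (unescaped-slash) form.

    # Minimal sketch of the credential-scrubbing pattern seen in the cleanup log below.
    # Assumes it is run from the repository root of a checked-out pytorch/pytorch tree.
    scrub() {
      pattern="$1"   # regexp passed to --get-regexp
      name="$2"      # literal key passed to --unset-all
      # top-level repository: only unset the key if it is actually set
      git config --local --name-only --get-regexp "$pattern" \
        && git config --local --unset-all "$name" || :
      # then repeat the same check-and-unset in every submodule, recursively
      git submodule foreach --recursive sh -c \
        "git config --local --name-only --get-regexp '$pattern' && git config --local --unset-all '$name' || :"
    }
    scrub 'core\.sshCommand'                          'core.sshCommand'
    scrub 'http\.https://github\.com/\.extraheader'   'http.https://github.com/.extraheader'

The "Entering '<submodule>'" lines that follow are the per-submodule progress output of these foreach invocations.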
2025-12-04T13:49:31.1142171Z [command]/usr/bin/git version 2025-12-04T13:49:31.1163166Z git version 2.52.0 2025-12-04T13:49:31.1179229Z Copying '/home/runner/.gitconfig' to '/home/runner/_work/_temp/55e9c632-7928-4a62-ba87-db6341ba1ccf/.gitconfig' 2025-12-04T13:49:31.1184344Z Temporarily overriding HOME='/home/runner/_work/_temp/55e9c632-7928-4a62-ba87-db6341ba1ccf' before making global git config changes 2025-12-04T13:49:31.1184800Z Adding repository directory to the temporary git global config as a safe directory 2025-12-04T13:49:31.1186433Z [command]/usr/bin/git config --global --add safe.directory /home/runner/_work/pytorch/pytorch 2025-12-04T13:49:31.1207658Z [command]/usr/bin/git config --local --name-only --get-regexp core\.sshCommand 2025-12-04T13:49:31.1232899Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'core\.sshCommand' && git config --local --unset-all 'core.sshCommand' || :" 2025-12-04T13:49:31.1423488Z Entering 'android/libs/fbjni' 2025-12-04T13:49:31.1449470Z Entering 'third_party/FP16' 2025-12-04T13:49:31.1475253Z Entering 'third_party/FXdiv' 2025-12-04T13:49:31.1501911Z Entering 'third_party/NNPACK' 2025-12-04T13:49:31.1535542Z Entering 'third_party/NVTX' 2025-12-04T13:49:31.1568830Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:31.1596457Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:31.1629528Z Entering 'third_party/aiter' 2025-12-04T13:49:31.1655352Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:31.1685138Z Entering 'third_party/benchmark' 2025-12-04T13:49:31.1710051Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:31.1742827Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:31.1775384Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:31.1806552Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:31.1837929Z Entering 'third_party/cutlass' 2025-12-04T13:49:31.1874675Z Entering 'third_party/fbgemm' 2025-12-04T13:49:31.1901768Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:31.1927677Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:31.1955597Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:31.1979710Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:31.2008546Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:31.2046172Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:31.2072932Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:31.2100819Z Entering 'third_party/flash-attention' 2025-12-04T13:49:31.2129542Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:31.2160269Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:31.2193852Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:31.2224651Z Entering 'third_party/fmt' 2025-12-04T13:49:31.2251006Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:31.2275988Z Entering 'third_party/gloo' 2025-12-04T13:49:31.2300043Z Entering 'third_party/googletest' 2025-12-04T13:49:31.2325147Z Entering 'third_party/ideep' 2025-12-04T13:49:31.2356855Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:31.2386245Z Entering 'third_party/ittapi' 2025-12-04T13:49:31.2412907Z Entering 'third_party/kineto' 2025-12-04T13:49:31.2440306Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:31.2465549Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:31.2497319Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:31.2524413Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:31.2554270Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:31.2585701Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:31.2612206Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:31.2643508Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:31.2667385Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:31.2691715Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:31.2718829Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:49:31.2749523Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:31.2775491Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:31.2808427Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:49:31.2845828Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:49:31.2878810Z Entering 'third_party/kleidiai' 2025-12-04T13:49:31.2907708Z Entering 'third_party/mimalloc' 2025-12-04T13:49:31.2942861Z Entering 'third_party/nlohmann' 2025-12-04T13:49:31.2978935Z Entering 'third_party/onnx' 2025-12-04T13:49:31.3013159Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:49:31.3046541Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:49:31.3076951Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:49:31.3109396Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:49:31.3137869Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:49:31.3167352Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:49:31.3200898Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:49:31.3226007Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:49:31.3259418Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:49:31.3287254Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:31.3317646Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:31.3344334Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:49:31.3377264Z Entering 'third_party/pocketfft' 2025-12-04T13:49:31.3403897Z Entering 'third_party/protobuf' 2025-12-04T13:49:31.3433334Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:49:31.3461190Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:49:31.3493671Z Entering 'third_party/psimd' 2025-12-04T13:49:31.3522583Z Entering 'third_party/pthreadpool' 2025-12-04T13:49:31.3556561Z Entering 'third_party/pybind11' 2025-12-04T13:49:31.3584722Z Entering 'third_party/python-peachpy' 2025-12-04T13:49:31.3615218Z Entering 'third_party/sleef' 2025-12-04T13:49:31.3646148Z Entering 'third_party/tensorpipe' 2025-12-04T13:49:31.3682055Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:49:31.3717326Z Entering 
'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:49:31.3742920Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:49:31.3768911Z Entering 'third_party/tensorpipe/third_party/pybind11' 2025-12-04T13:49:31.3795094Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:49:31.3843885Z [command]/usr/bin/git config --local --name-only --get-regexp http\.https\:\/\/github\.com\/\.extraheader 2025-12-04T13:49:31.3865002Z [command]/usr/bin/git submodule foreach --recursive sh -c "git config --local --name-only --get-regexp 'http\.https\:\/\/github\.com\/\.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :" 2025-12-04T13:49:31.4027175Z Entering 'android/libs/fbjni' 2025-12-04T13:49:31.4052664Z Entering 'third_party/FP16' 2025-12-04T13:49:31.4075790Z Entering 'third_party/FXdiv' 2025-12-04T13:49:31.4100949Z Entering 'third_party/NNPACK' 2025-12-04T13:49:31.4126637Z Entering 'third_party/NVTX' 2025-12-04T13:49:31.4151514Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:31.4177093Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:31.4207892Z Entering 'third_party/aiter' 2025-12-04T13:49:31.4234974Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:31.4265629Z Entering 'third_party/benchmark' 2025-12-04T13:49:31.4290574Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:31.4320258Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:31.4345246Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:31.4369787Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:31.4394788Z Entering 'third_party/cutlass' 2025-12-04T13:49:31.4429434Z Entering 'third_party/fbgemm' 2025-12-04T13:49:31.4455771Z Entering 'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:31.4479186Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:31.4512287Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:31.4536465Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:31.4564435Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:31.4588946Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:31.4615437Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:31.4642430Z Entering 'third_party/flash-attention' 2025-12-04T13:49:31.4668545Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:31.4713335Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:31.4752245Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:31.4787044Z Entering 'third_party/fmt' 2025-12-04T13:49:31.4820924Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:31.4851578Z Entering 'third_party/gloo' 2025-12-04T13:49:31.4884998Z Entering 'third_party/googletest' 2025-12-04T13:49:31.4912030Z Entering 'third_party/ideep' 2025-12-04T13:49:31.4939148Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:31.4970608Z Entering 'third_party/ittapi' 2025-12-04T13:49:31.4996953Z Entering 'third_party/kineto' 2025-12-04T13:49:31.5028861Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:31.5053817Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:31.5083691Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:31.5121342Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:31.5146339Z Entering 
'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:31.5172382Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:31.5204106Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:31.5233374Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:31.5263643Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:31.5296019Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:31.5321782Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 2025-12-04T13:49:31.5347945Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:31.5373653Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:31.5408998Z Entering 'third_party/kineto/libkineto/third_party/fmt' 2025-12-04T13:49:31.5436782Z Entering 'third_party/kineto/libkineto/third_party/googletest' 2025-12-04T13:49:31.5463481Z Entering 'third_party/kleidiai' 2025-12-04T13:49:31.5489034Z Entering 'third_party/mimalloc' 2025-12-04T13:49:31.5514347Z Entering 'third_party/nlohmann' 2025-12-04T13:49:31.5540002Z Entering 'third_party/onnx' 2025-12-04T13:49:31.5571318Z Entering 'third_party/onnx/third_party/pybind11' 2025-12-04T13:49:31.5599735Z Entering 'third_party/opentelemetry-cpp' 2025-12-04T13:49:31.5626203Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark' 2025-12-04T13:49:31.5650131Z Entering 'third_party/opentelemetry-cpp/third_party/googletest' 2025-12-04T13:49:31.5679328Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl' 2025-12-04T13:49:31.5703718Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json' 2025-12-04T13:49:31.5733538Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto' 2025-12-04T13:49:31.5757035Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp' 2025-12-04T13:49:31.5782513Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp' 2025-12-04T13:49:31.5811476Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb' 2025-12-04T13:49:31.5841218Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest' 2025-12-04T13:49:31.5871095Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg' 2025-12-04T13:49:31.5907536Z Entering 'third_party/pocketfft' 2025-12-04T13:49:31.5944481Z Entering 'third_party/protobuf' 2025-12-04T13:49:31.5972435Z Entering 'third_party/protobuf/third_party/benchmark' 2025-12-04T13:49:31.6007202Z Entering 'third_party/protobuf/third_party/googletest' 2025-12-04T13:49:31.6039622Z Entering 'third_party/psimd' 2025-12-04T13:49:31.6066704Z Entering 'third_party/pthreadpool' 2025-12-04T13:49:31.6096359Z Entering 'third_party/pybind11' 2025-12-04T13:49:31.6127340Z Entering 'third_party/python-peachpy' 2025-12-04T13:49:31.6152513Z Entering 'third_party/sleef' 2025-12-04T13:49:31.6179681Z Entering 'third_party/tensorpipe' 2025-12-04T13:49:31.6209191Z Entering 'third_party/tensorpipe/third_party/googletest' 2025-12-04T13:49:31.6239573Z Entering 'third_party/tensorpipe/third_party/libnop' 2025-12-04T13:49:31.6265444Z Entering 'third_party/tensorpipe/third_party/libuv' 2025-12-04T13:49:31.6290492Z Entering 'third_party/tensorpipe/third_party/pybind11' 
2025-12-04T13:49:31.6316452Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang' 2025-12-04T13:49:31.6364937Z [command]/usr/bin/git config --local --name-only --get-regexp ^includeIf\.gitdir: 2025-12-04T13:49:31.6389974Z [command]/usr/bin/git submodule foreach --recursive git config --local --show-origin --name-only --get-regexp remote.origin.url 2025-12-04T13:49:31.6590062Z Entering 'android/libs/fbjni' 2025-12-04T13:49:31.6604842Z file:/home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config remote.origin.url 2025-12-04T13:49:31.6617542Z Entering 'third_party/FP16' 2025-12-04T13:49:31.6630981Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config remote.origin.url 2025-12-04T13:49:31.6642698Z Entering 'third_party/FXdiv' 2025-12-04T13:49:31.6657326Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config remote.origin.url 2025-12-04T13:49:31.6667522Z Entering 'third_party/NNPACK' 2025-12-04T13:49:31.6680155Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config remote.origin.url 2025-12-04T13:49:31.6688503Z Entering 'third_party/NVTX' 2025-12-04T13:49:31.6701606Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config remote.origin.url 2025-12-04T13:49:31.6711425Z Entering 'third_party/VulkanMemoryAllocator' 2025-12-04T13:49:31.6729852Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config remote.origin.url 2025-12-04T13:49:31.6737746Z Entering 'third_party/XNNPACK' 2025-12-04T13:49:31.6751249Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config remote.origin.url 2025-12-04T13:49:31.6767394Z Entering 'third_party/aiter' 2025-12-04T13:49:31.6778537Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config remote.origin.url 2025-12-04T13:49:31.6788395Z Entering 'third_party/aiter/3rdparty/composable_kernel' 2025-12-04T13:49:31.6798750Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config remote.origin.url 2025-12-04T13:49:31.6820792Z Entering 'third_party/benchmark' 2025-12-04T13:49:31.6835539Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config remote.origin.url 2025-12-04T13:49:31.6845407Z Entering 'third_party/composable_kernel' 2025-12-04T13:49:31.6859055Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config remote.origin.url 2025-12-04T13:49:31.6876377Z Entering 'third_party/cpp-httplib' 2025-12-04T13:49:31.6888783Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config remote.origin.url 2025-12-04T13:49:31.6899116Z Entering 'third_party/cpuinfo' 2025-12-04T13:49:31.6910488Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config remote.origin.url 2025-12-04T13:49:31.6920386Z Entering 'third_party/cudnn_frontend' 2025-12-04T13:49:31.6933241Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config remote.origin.url 2025-12-04T13:49:31.6942536Z Entering 'third_party/cutlass' 2025-12-04T13:49:31.6957701Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config remote.origin.url 2025-12-04T13:49:31.6973978Z Entering 'third_party/fbgemm' 2025-12-04T13:49:31.6988963Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config remote.origin.url 2025-12-04T13:49:31.6999636Z Entering 
'third_party/fbgemm/external/asmjit' 2025-12-04T13:49:31.7010588Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config remote.origin.url 2025-12-04T13:49:31.7020338Z Entering 'third_party/fbgemm/external/composable_kernel' 2025-12-04T13:49:31.7035431Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config remote.origin.url 2025-12-04T13:49:31.7048108Z Entering 'third_party/fbgemm/external/cpuinfo' 2025-12-04T13:49:31.7060440Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config remote.origin.url 2025-12-04T13:49:31.7069691Z Entering 'third_party/fbgemm/external/cutlass' 2025-12-04T13:49:31.7086414Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config remote.origin.url 2025-12-04T13:49:31.7102553Z Entering 'third_party/fbgemm/external/googletest' 2025-12-04T13:49:31.7121591Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config remote.origin.url 2025-12-04T13:49:31.7129372Z Entering 'third_party/fbgemm/external/hipify_torch' 2025-12-04T13:49:31.7141097Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config remote.origin.url 2025-12-04T13:49:31.7149698Z Entering 'third_party/fbgemm/external/json' 2025-12-04T13:49:31.7168919Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config remote.origin.url 2025-12-04T13:49:31.7189278Z Entering 'third_party/flash-attention' 2025-12-04T13:49:31.7205055Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config remote.origin.url 2025-12-04T13:49:31.7220286Z Entering 'third_party/flash-attention/csrc/composable_kernel' 2025-12-04T13:49:31.7230442Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config remote.origin.url 2025-12-04T13:49:31.7247547Z Entering 'third_party/flash-attention/csrc/cutlass' 2025-12-04T13:49:31.7258811Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config remote.origin.url 2025-12-04T13:49:31.7279447Z Entering 'third_party/flatbuffers' 2025-12-04T13:49:31.7292751Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config remote.origin.url 2025-12-04T13:49:31.7305661Z Entering 'third_party/fmt' 2025-12-04T13:49:31.7319959Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config remote.origin.url 2025-12-04T13:49:31.7330714Z Entering 'third_party/gemmlowp/gemmlowp' 2025-12-04T13:49:31.7345977Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config remote.origin.url 2025-12-04T13:49:31.7359231Z Entering 'third_party/gloo' 2025-12-04T13:49:31.7370899Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config remote.origin.url 2025-12-04T13:49:31.7381763Z Entering 'third_party/googletest' 2025-12-04T13:49:31.7395553Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:31.7410417Z Entering 'third_party/ideep' 2025-12-04T13:49:31.7420957Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config remote.origin.url 2025-12-04T13:49:31.7429796Z Entering 'third_party/ideep/mkl-dnn' 2025-12-04T13:49:31.7442522Z 
file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config remote.origin.url 2025-12-04T13:49:31.7456103Z Entering 'third_party/ittapi' 2025-12-04T13:49:31.7468864Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config remote.origin.url 2025-12-04T13:49:31.7481603Z Entering 'third_party/kineto' 2025-12-04T13:49:31.7497720Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config remote.origin.url 2025-12-04T13:49:31.7510370Z Entering 'third_party/kineto/libkineto/third_party/dynolog' 2025-12-04T13:49:31.7525085Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config remote.origin.url 2025-12-04T13:49:31.7534999Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/DCGM' 2025-12-04T13:49:31.7548072Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config remote.origin.url 2025-12-04T13:49:31.7557915Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/cpr' 2025-12-04T13:49:31.7575081Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config remote.origin.url 2025-12-04T13:49:31.7583948Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/fmt' 2025-12-04T13:49:31.7599725Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config remote.origin.url 2025-12-04T13:49:31.7610366Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags' 2025-12-04T13:49:31.7624563Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config remote.origin.url 2025-12-04T13:49:31.7634330Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/gflags/doc' 2025-12-04T13:49:31.7648438Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config remote.origin.url 2025-12-04T13:49:31.7659718Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/glog' 2025-12-04T13:49:31.7673045Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config remote.origin.url 2025-12-04T13:49:31.7681426Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/googletest' 2025-12-04T13:49:31.7695371Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config remote.origin.url 2025-12-04T13:49:31.7704951Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/json' 2025-12-04T13:49:31.7717235Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config remote.origin.url 2025-12-04T13:49:31.7726507Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/pfs' 2025-12-04T13:49:31.7742320Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config remote.origin.url 2025-12-04T13:49:31.7753545Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp' 
2025-12-04T13:49:31.7767025Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config remote.origin.url
2025-12-04T13:49:31.7777247Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/civetweb'
2025-12-04T13:49:31.7789080Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url
2025-12-04T13:49:31.7805034Z Entering 'third_party/kineto/libkineto/third_party/dynolog/third_party/prometheus-cpp/3rdparty/googletest'
2025-12-04T13:49:31.7820736Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url
2025-12-04T13:49:31.7838664Z Entering 'third_party/kineto/libkineto/third_party/fmt'
2025-12-04T13:49:31.7851579Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config remote.origin.url
2025-12-04T13:49:31.7860938Z Entering 'third_party/kineto/libkineto/third_party/googletest'
2025-12-04T13:49:31.7873877Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config remote.origin.url
2025-12-04T13:49:31.7886742Z Entering 'third_party/kleidiai'
2025-12-04T13:49:31.7898525Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config remote.origin.url
2025-12-04T13:49:31.7908747Z Entering 'third_party/mimalloc'
2025-12-04T13:49:31.7921209Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config remote.origin.url
2025-12-04T13:49:31.7931473Z Entering 'third_party/nlohmann'
2025-12-04T13:49:31.7942979Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config remote.origin.url
2025-12-04T13:49:31.7953611Z Entering 'third_party/onnx'
2025-12-04T13:49:31.7965206Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config remote.origin.url
2025-12-04T13:49:31.7981617Z Entering 'third_party/onnx/third_party/pybind11'
2025-12-04T13:49:31.7992662Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config remote.origin.url
2025-12-04T13:49:31.8004966Z Entering 'third_party/opentelemetry-cpp'
2025-12-04T13:49:31.8016638Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config remote.origin.url
2025-12-04T13:49:31.8025944Z Entering 'third_party/opentelemetry-cpp/third_party/benchmark'
2025-12-04T13:49:31.8038243Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config remote.origin.url
2025-12-04T13:49:31.8050552Z Entering 'third_party/opentelemetry-cpp/third_party/googletest'
2025-12-04T13:49:31.8063468Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config remote.origin.url
2025-12-04T13:49:31.8073087Z Entering 'third_party/opentelemetry-cpp/third_party/ms-gsl'
2025-12-04T13:49:31.8087707Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config remote.origin.url
2025-12-04T13:49:31.8101884Z Entering 'third_party/opentelemetry-cpp/third_party/nlohmann-json'
2025-12-04T13:49:31.8116468Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config remote.origin.url
2025-12-04T13:49:31.8128901Z Entering 'third_party/opentelemetry-cpp/third_party/opentelemetry-proto'
2025-12-04T13:49:31.8146890Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config remote.origin.url
2025-12-04T13:49:31.8159141Z Entering 'third_party/opentelemetry-cpp/third_party/opentracing-cpp'
2025-12-04T13:49:31.8174140Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config remote.origin.url
2025-12-04T13:49:31.8184047Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp'
2025-12-04T13:49:31.8195264Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config remote.origin.url
2025-12-04T13:49:31.8204886Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/civetweb'
2025-12-04T13:49:31.8217997Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config remote.origin.url
2025-12-04T13:49:31.8229067Z Entering 'third_party/opentelemetry-cpp/third_party/prometheus-cpp/3rdparty/googletest'
2025-12-04T13:49:31.8241020Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config remote.origin.url
2025-12-04T13:49:31.8252895Z Entering 'third_party/opentelemetry-cpp/tools/vcpkg'
2025-12-04T13:49:31.8265070Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config remote.origin.url
2025-12-04T13:49:31.8281126Z Entering 'third_party/pocketfft'
2025-12-04T13:49:31.8292383Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config remote.origin.url
2025-12-04T13:49:31.8302245Z Entering 'third_party/protobuf'
2025-12-04T13:49:31.8313052Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config remote.origin.url
2025-12-04T13:49:31.8323758Z Entering 'third_party/protobuf/third_party/benchmark'
2025-12-04T13:49:31.8334347Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config remote.origin.url
2025-12-04T13:49:31.8347286Z Entering 'third_party/protobuf/third_party/googletest'
2025-12-04T13:49:31.8364260Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config remote.origin.url
2025-12-04T13:49:31.8377818Z Entering 'third_party/psimd'
2025-12-04T13:49:31.8389473Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config remote.origin.url
2025-12-04T13:49:31.8399473Z Entering 'third_party/pthreadpool'
2025-12-04T13:49:31.8413248Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config remote.origin.url
2025-12-04T13:49:31.8424252Z Entering 'third_party/pybind11'
2025-12-04T13:49:31.8435812Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config remote.origin.url
2025-12-04T13:49:31.8445519Z Entering 'third_party/python-peachpy'
2025-12-04T13:49:31.8457400Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config remote.origin.url
2025-12-04T13:49:31.8466453Z Entering 'third_party/sleef'
2025-12-04T13:49:31.8477576Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config remote.origin.url
2025-12-04T13:49:31.8487327Z Entering 'third_party/tensorpipe'
2025-12-04T13:49:31.8499185Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config remote.origin.url
2025-12-04T13:49:31.8508873Z Entering 'third_party/tensorpipe/third_party/googletest'
2025-12-04T13:49:31.8522847Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config remote.origin.url
2025-12-04T13:49:31.8532864Z Entering 'third_party/tensorpipe/third_party/libnop'
2025-12-04T13:49:31.8546673Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config remote.origin.url
2025-12-04T13:49:31.8556125Z Entering 'third_party/tensorpipe/third_party/libuv'
2025-12-04T13:49:31.8577372Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config remote.origin.url
2025-12-04T13:49:31.8589643Z Entering 'third_party/tensorpipe/third_party/pybind11'
2025-12-04T13:49:31.8605125Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config remote.origin.url
2025-12-04T13:49:31.8614717Z Entering 'third_party/tensorpipe/third_party/pybind11/tools/clang'
2025-12-04T13:49:31.8626075Z file:/home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config remote.origin.url
2025-12-04T13:49:31.8656353Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/android/libs/fbjni/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8675301Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FP16/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8693071Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/FXdiv/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8708489Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8724105Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NVTX/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8744796Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/VulkanMemoryAllocator/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8760106Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/XNNPACK/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8782424Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8789112Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/aiter/modules/3rdparty/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8802580Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8819471Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8834542Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpp-httplib/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8850397Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8869545Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cudnn_frontend/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8887162Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8901871Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8916661Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/asmjit/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8931146Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8950786Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cpuinfo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8965771Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8981843Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.8999963Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/hipify_torch/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9014805Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fbgemm/modules/external/json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9029085Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9043688Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/composable_kernel/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9063864Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flash-attention/modules/csrc/cutlass/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9079554Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/flatbuffers/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9095644Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9111175Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gemmlowp/gemmlowp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9125690Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/gloo/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9140685Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9155174Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9169624Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ideep/modules/mkl-dnn/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9184894Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/ittapi/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9203256Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9217923Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9241659Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/DCGM/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9259034Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/cpr/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9275262Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9291882Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9307689Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/gflags/modules/doc/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9324485Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/glog/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9343586Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9359110Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9374078Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/pfs/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9390102Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9407941Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9424987Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/dynolog/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9439898Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/fmt/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9457065Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kineto/modules/libkineto/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9471989Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/kleidiai/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9486584Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/mimalloc/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9501061Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/nlohmann/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9515529Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9530016Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/onnx/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9545267Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9560929Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9575845Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9590980Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/ms-gsl/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9615803Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/nlohmann-json/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9632391Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentelemetry-proto/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9645849Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/opentracing-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9669828Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9696348Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/civetweb/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9726842Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/third_party/prometheus-cpp/modules/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9745751Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/opentelemetry-cpp/modules/tools/vcpkg/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9760013Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pocketfft/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9775465Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9789795Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/benchmark/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9804645Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/protobuf/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9819501Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/psimd/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9833880Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/NNPACK_deps/pthreadpool/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9848344Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9863176Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/python-peachpy/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9877009Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/sleef/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9891743Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9906186Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/googletest/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9920706Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libnop/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9934021Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/libuv/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9951207Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:31.9965699Z [command]/usr/bin/git config --file /home/runner/_work/pytorch/pytorch/.git/modules/third_party/tensorpipe/modules/third_party/pybind11/modules/tools/clang/config --name-only --get-regexp ^includeIf\.gitdir:
2025-12-04T13:49:32.0072842Z Cleaning up orphan processes